How to Work with Excel Spreadsheets using Python



Excel, developed by Microsoft, is one of the most popular and widely used spreadsheet applications. You can organize, analyze, and store your data in tabular sheets with Excel. From analysts and sales managers to CEOs, professionals in every field use Excel for quick statistics and data crunching.

Spreadsheets are so common because they are intuitive and can handle large datasets. Most importantly, they can be used without any prior technical background.

Working with Excel files through code is worthwhile because handling data in Python has some serious advantages over Excel’s UI. Python developers have built libraries to read, write, and manipulate Excel documents.

You can check the quality of your spreadsheet application by going over the checklist below:

  • Is the spreadsheet able to represent static data?
  • Is the spreadsheet able to mix data, calculations, and reports?
  • Is the data in your spreadsheet complete and consistent in nature?
  • Does the spreadsheet have an organized worksheet structure?

This checklist will help you verify the quality of the spreadsheet you’re going to work on.

Practical Applications  

In this article, we will use openpyxl to work with data. With this module, you can export data from a database into an Excel spreadsheet, or convert an Excel spreadsheet into a format your program can work with. There are many situations where you might need a package like openpyxl. Let us discuss a few of them to get an overview.

Importing New Products Into a Database 

Suppose you work for an online store. When the company wants to add new products to the store, someone prepares an Excel spreadsheet with a few hundred rows containing each product's name, description, price, and a few other basic details, and hands it to you.

To import this data, you need to iterate over each row of the spreadsheet and add each product to the online store's database.
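
The import loop described above can be sketched as follows. This is only an illustration under stated assumptions: the column layout (name, description, price), the sample rows, and the products table are hypothetical, and an in-memory SQLite database stands in for the store's real database.

```python
import sqlite3

from openpyxl import Workbook

# Build a small in-memory workbook standing in for the spreadsheet you receive.
# The columns and rows are made up for this sketch.
workbook = Workbook()
sheet = workbook.active
sheet.append(["name", "description", "price"])
sheet.append(["Classic Watch", "Leather strap, analog", 59.99])
sheet.append(["Sport Watch", "Water resistant, digital", 89.50])

# An in-memory SQLite database stands in for the online store's database.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE products (name TEXT, description TEXT, price REAL)"
)

# Skip the header row (min_row=2) and read plain values instead of Cell objects.
for name, description, price in sheet.iter_rows(min_row=2, values_only=True):
    connection.execute(
        "INSERT INTO products (name, description, price) VALUES (?, ?, ?)",
        (name, description, price),
    )
connection.commit()

imported = connection.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(imported)
```

With a real file, you would replace the hand-built workbook with load_workbook(filename=...) and point the connection at your actual database.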

Exporting Database Data Into a Spreadsheet

Suppose you have a database table that holds information about all your users, including their name, contact number, email address, and so forth. The marketing team wants to contact all the users to promote a new product. However, they neither have access to the database nor know how to use SQL to extract the information.

In this situation, openpyxl comes into play. You can use it to iterate over each user record and write the required information into an Excel spreadsheet.
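
A minimal sketch of such an export, assuming a hypothetical users table with name, phone, and email columns in an in-memory SQLite database (the table, the sample records, and the sheet title are all illustrative):

```python
import sqlite3

from openpyxl import Workbook

# Hypothetical users table standing in for the real database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (name TEXT, phone TEXT, email TEXT)")
connection.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("Ada", "555-0100", "ada@example.com"),
        ("Grace", "555-0101", "grace@example.com"),
    ],
)

workbook = Workbook()
sheet = workbook.active
sheet.title = "Users"

# Write a header row first, then one spreadsheet row per user record.
sheet.append(["Name", "Contact Number", "Email"])
query = "SELECT name, phone, email FROM users ORDER BY name"
for record in connection.execute(query):
    sheet.append(record)  # each record is a tuple, which append() accepts

print(sheet.max_row)
```

Calling workbook.save(filename=...) afterwards would write the result to disk so it can be handed to the marketing team.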

Appending Information to an Existing Spreadsheet

Consider the same online store example we discussed above. You have an Excel spreadsheet with a list of users and your job is to append to each row the total amount they have spent in your store.

To do this, you read the spreadsheet, iterate through each row, fetch the total amount spent from the database, and write it back to the spreadsheet.
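
The read, look up, and write-back steps can be sketched like this. The user_id and name columns are assumptions, and the totals_by_user dictionary is a hypothetical stand-in for the database query:

```python
from openpyxl import Workbook

# In-memory workbook standing in for the existing list of users.
workbook = Workbook()
sheet = workbook.active
sheet.append(["user_id", "name"])
sheet.append([1, "Ada"])
sheet.append([2, "Grace"])

# Hypothetical lookup standing in for a query against the store's database.
totals_by_user = {1: 120.50, 2: 75.00}

# The new column goes right after the last used column.
total_column = sheet.max_column + 1
sheet.cell(row=1, column=total_column, value="total_spent")

# Walk the data rows and write each user's total next to their record.
for row in sheet.iter_rows(min_row=2, max_col=total_column - 1):
    user_id = row[0].value
    total = totals_by_user.get(user_id, 0)
    sheet.cell(row=row[0].row, column=total_column, value=total)

print(sheet["C2"].value, sheet["C3"].value)
```

The changes only exist in memory until workbook.save() is called.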

Starting openpyxl

You can install the openpyxl package using pip. Open your terminal and run the following command:

$ pip install openpyxl

After you have installed the package, you can create a simple spreadsheet of your own:

from openpyxl import Workbook

workbook = Workbook()
spreadsheet = workbook.active

spreadsheet["A1"] = "Hello"
spreadsheet["B1"] = "World!"

workbook.save(filename="HelloWorld.xlsx")

How to Read Excel Spreadsheets with openpyxl 

Let us start with the most essential thing you can do with a spreadsheet: read it. We will use a watch sample dataset that contains a list of 100 watches with information such as product name, product ID, and review.

A Simple Way to Read an Excel Spreadsheet 

Let us start with opening our sample spreadsheet:

>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename="sample.xlsx")
>>> workbook.sheetnames
['Sheet 1']

>>> spreadsheet = workbook.active
>>> spreadsheet
<Worksheet "Sheet 1">

>>> spreadsheet.title
'Sheet 1'

In the example code above, we open the spreadsheet using load_workbook and then we check all the sheets that are available to work with using workbook.sheetnames. Then Sheet 1 is automatically selected using workbook.active since it is the first sheet available. This is the most common way of opening a spreadsheet.  

Now, let us see the code to retrieve data from the spreadsheet: 

>>> spreadsheet["A1"]
<Cell 'Sheet 1'.A1>

>>> spreadsheet["A1"].value
'marketplace'

>>> spreadsheet["F10"].value
"G-Shock Men's Grey Sport Watch"

You can retrieve both the cell object and the value it holds. Indexing the sheet returns the cell, and the cell's .value attribute returns the stored value. You can also fetch a cell by its row and column numbers with .cell():

>>> spreadsheet.cell(row=10, column=6)
<Cell 'Sheet 1'.F10>

>>> spreadsheet.cell(row=10, column=6).value
"G-Shock Men's Grey Sport Watch"

Importing Data from a Spreadsheet 

In this section, we will discuss how to iterate through the data and how to convert it into a more useful format using Python.

Let us start with iterating through the data. There are several ways to do this, and which one you choose depends on what you need.

You can slice the data with a combination of rows and columns:

>>> spreadsheet["A1:C2"]
((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>), 
 (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)) 

You can also select entire rows or columns, or ranges of them:

>>> # Get all cells from column A 
>>> spreadsheet["A"]
(<Cell 'Sheet 1'.A1>, 
 <Cell 'Sheet 1'.A2>, 
 ... 
 <Cell 'Sheet 1'.A99>, 
 <Cell 'Sheet 1'.A100>) 

>>> # Get all cells for a range of columns 
>>> spreadsheet["A:B"] 
((<Cell 'Sheet 1'.A1>, 
  <Cell 'Sheet 1'.A2>, 
  ... 
  <Cell 'Sheet 1'.A99>, 
  <Cell 'Sheet 1'.A100>), 
 (<Cell 'Sheet 1'.B1>, 
  <Cell 'Sheet 1'.B2>, 
  ... 
  <Cell 'Sheet 1'.B99>, 
  <Cell 'Sheet 1'.B100>)) 

>>> # Get all cells from row 5 
>>> spreadsheet[5]
(<Cell 'Sheet 1'.A5>,
 <Cell 'Sheet 1'.B5>,
 ... 
 <Cell 'Sheet 1'.N5>,
 <Cell 'Sheet 1'.O5>)

>>> # Get all cells for a range of rows
>>> spreadsheet[5:6]
((<Cell 'Sheet 1'.A5>, 
  <Cell 'Sheet 1'.B5>, 
  ... 
  <Cell 'Sheet 1'.N5>, 
  <Cell 'Sheet 1'.O5>), 
 (<Cell 'Sheet 1'.A6>, 
  <Cell 'Sheet 1'.B6>, 
  ... 
  <Cell 'Sheet 1'.N6>, 
  <Cell 'Sheet 1'.O6>)) 

openpyxl also provides the generator methods .iter_rows() and .iter_cols(), whose arguments let you set the limits of the iteration:

>>> for row in spreadsheet.iter_rows(min_row=1,
...                           max_row=2,
...                           min_col=1,
...                           max_col=3):
...    print(row)
(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>) 
(<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)

>>> for column in spreadsheet.iter_cols(min_row=1,
...                                     max_row=2,
...                                     min_col=1,
...                                     max_col=3):
...     print(column)
(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.A2>) 
(<Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.B2>) 
(<Cell 'Sheet 1'.C1>, <Cell 'Sheet 1'.C2>) 

You can also pass the Boolean argument values_only=True in the above example to get the cell values instead of the cell objects:

>>> for value in spreadsheet.iter_rows(min_row=1,
...                                    max_row=2,
...                                    min_col=1,
...                                    max_col=3,
...                                    values_only=True):
...     print(value)
('marketplace', 'customer_id', 'review_id') 
('US', 3653882, 'R3O9SGZBVQBV76')

Now that we are done with iterating through the data, let us manipulate it using Python’s primitive data structures.

Consider a situation where you want to extract the information about each product from the sample spreadsheet and store it in a dictionary whose keys are the product IDs.
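The dictionary idea can be sketched as follows. This is only an illustration under stated assumptions: the workbook is built in memory with two made-up rows instead of loading the watch sample, and the product fields are assumed to sit in columns 4 to 7 (D to G), which matches the column order of the sample dataset.

```python
from openpyxl import Workbook

# Build a tiny in-memory stand-in for the watch sample (values are made up).
workbook = Workbook()
spreadsheet = workbook.active
spreadsheet.append(["marketplace", "customer_id", "review_id", "product_id",
                    "product_parent", "product_title", "product_category"])
spreadsheet.append(["US", 1, "R1", "P-001", 111, "Example Watch A", "Watches"])
spreadsheet.append(["US", 2, "R2", "P-002", 222, "Example Watch B", "Watches"])

products = {}

# Row 1 is the header, so start at row 2 and read only the product columns.
for row in spreadsheet.iter_rows(min_row=2, min_col=4, max_col=7,
                                 values_only=True):
    product_id, parent, title, category = row
    products[product_id] = {
        "parent": parent,
        "title": title,
        "category": category,
    }

print(products["P-001"]["title"])  # Example Watch A
```

Keying the dictionary by product ID means that repeated rows for the same product simply overwrite one another, which is usually what you want when many reviews refer to one product.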

Convert Data into Python classes

To convert data into Python data classes, let us first decide what we want to store and how to store it.  

The two essential elements that can be extracted from the data are as follows:

1. Products
  • ID
  • Title
  • Parent
  • Category

2. Review
  • ID
  • Customer ID
  • Stars
  • Headline
  • Body
  • Date

Let us implement the two elements: 

import datetime
from dataclasses import dataclass

@dataclass
class Product:
    id: str
    parent: str
    title: str
    category: str


@dataclass
class Review:
    id: str
    customer_id: str
    stars: int
    headline: str
    body: str 
    date: datetime.datetime 

The next step is to create a mapping between columns and the required fields: 

>>> for value in spreadsheet.iter_rows(min_row=1,
...                             max_row=1,
...                             values_only=True):
...    print(value)
('marketplace', 'customer_id', 'review_id', 'product_id', ...)

>>> # Or an alternative
>>> for cell in spreadsheet[1]:
...     print(cell.value)
marketplace
customer_id
review_id
product_id
product_parent
...
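The parsing script in this section imports constants such as PRODUCT_ID from a mapping module whose contents are not listed here. A possible sketch of that file, assuming the 15 column names reported by df.columns in the pandas section of this article and zero-based positions (because the rows are read as tuples), could look like this. The HEADER and COLUMN_INDEX names are hypothetical helpers.

```python
# mapping.py (hypothetical contents; the article does not show this file)

# Column names of the watch sample, in spreadsheet order (A to O).
HEADER = (
    "marketplace", "customer_id", "review_id", "product_id",
    "product_parent", "product_title", "product_category", "star_rating",
    "helpful_votes", "total_votes", "vine", "verified_purchase",
    "review_headline", "review_body", "review_date",
)

# Zero-based position of every column, matching tuple indexes
# of rows returned by iter_rows(values_only=True).
COLUMN_INDEX = {name: position for position, name in enumerate(HEADER)}

# Product fields
PRODUCT_ID = COLUMN_INDEX["product_id"]
PRODUCT_PARENT = COLUMN_INDEX["product_parent"]
PRODUCT_TITLE = COLUMN_INDEX["product_title"]
PRODUCT_CATEGORY = COLUMN_INDEX["product_category"]

# Review fields
REVIEW_ID = COLUMN_INDEX["review_id"]
REVIEW_CUSTOMER = COLUMN_INDEX["customer_id"]
REVIEW_STARS = COLUMN_INDEX["star_rating"]
REVIEW_HEADLINE = COLUMN_INDEX["review_headline"]
REVIEW_BODY = COLUMN_INDEX["review_body"]
REVIEW_DATE = COLUMN_INDEX["review_date"]
```

Deriving the positions from the header names, rather than hard-coding numbers, keeps the mapping readable and makes it easy to adjust if the column order changes.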

Finally, let us parse the data in the spreadsheet into a list of Product objects and a list of Review objects:

from datetime import datetime
from openpyxl import load_workbook
from classes import Product, Review
from mapping import PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE, \
    PRODUCT_CATEGORY, REVIEW_DATE, REVIEW_ID, REVIEW_CUSTOMER, \
    REVIEW_STARS, REVIEW_HEADLINE, REVIEW_BODY

# Open in read_only mode since you're not going to edit the spreadsheet
workbook = load_workbook(filename="watch_sample.xlsx", read_only=True)
spreadsheet = workbook.active

products = []
reviews = []

# Use values_only because you just want the cell values
for row in spreadsheet.iter_rows(min_row=2, values_only=True):
    product = Product(id=row[PRODUCT_ID],
                      parent=row[PRODUCT_PARENT],
                      title=row[PRODUCT_TITLE],
                      category=row[PRODUCT_CATEGORY])
    products.append(product)

    # Parse the date from the spreadsheet into a datetime object
    spread_date = row[REVIEW_DATE]
    parsed_date = datetime.strptime(spread_date, "%Y-%m-%d")

    review = Review(id=row[REVIEW_ID],
                    customer_id=row[REVIEW_CUSTOMER],
                    stars=row[REVIEW_STARS],
                    headline=row[REVIEW_HEADLINE],
                    body=row[REVIEW_BODY],
                    date=parsed_date)
    reviews.append(review)

print(products[0])
print(reviews[0])

After you execute the code, you will get an output that looks like this:

Product(id='A90FALZ1ZC',parent=937111370,...)
Review(id='D3O9OGZVVQBV76',customer_id=3903882,...)

Appending Data 

To understand how to append data, let us go back to the first sample spreadsheet. We will open the document and append some data to it:

from openpyxl import load_workbook

# Start by opening the spreadsheet and selecting the main sheet 
workbook = load_workbook(filename="hello_world.xlsx")
spreadsheet = workbook.active

# Write what you want into a specific cell
spreadsheet["C1"] = "Manipulating_Data ;)"

# Save the spreadsheet under a new name
workbook.save(filename="hello_world_append.xlsx")

If you open the new Excel file, you will notice the additional Manipulating_Data ;) text in cell C1, next to the existing values.

Writing Excel Spreadsheets With openpyxl 

A spreadsheet is a file that helps to store data in specific rows and columns. We can calculate and store numerical data and also perform computation using formulas. 

So, let’s begin with some simple Spreadsheets and understand what each line means. 

Creating our first simple Spreadsheet

 1 from openpyxl import Workbook
 2  
 3 filename = "first_program.xlsx"
 4  
 5 workbook = Workbook()
 6 spreadsheet = workbook.active
 7  
 8 spreadsheet["A1"] = "first"
 9 spreadsheet["B1"] = "program!"
10  
11 workbook.save(filename=filename)

Line 5: To make a spreadsheet, we first have to create an empty workbook on which to perform further operations.

Lines 8 and 9: We can add data to specific cells as required. In this example, the two values “first” and “program!” are added to cells A1 and B1 of the sheet.

Line 11: This line saves the workbook to a file once all the operations are done.

Basic Spreadsheet Operations 

Before moving on to more involved code, we need a few building blocks: how to add and update values, how to manage rows and columns, and how to add filters, styles or formulas to a spreadsheet.

We have already explained the following code by which we can add values to a Spreadsheet: 

>>> spreadsheet["A1"] = "the_value_we_want_to_add"

There is another way to add or update a value, by working with the cell object itself:

>>> cell = spreadsheet["A1"]
>>> cell
<Cell 'Sheet'.A1>

>>> cell.value
'first'

>>> cell.value = "second"
>>> cell.value
'second'

Line 1: We fetch the cell object for A1 and keep a reference to it.

Line 5: We print the value of the cell, which is “first” because the first program already assigned “first” to spreadsheet["A1"].

Line 8: We update the value of the cell to “second” by simply assigning to cell.value.

Line 9: We print the updated value of the cell.

Remember that none of these changes are written to the file until you call workbook.save().

If the cell doesn’t exist yet when you assign a value to it, openpyxl creates it:

>>> # Before, our spreadsheet has only 1 row
>>> print_rows()
('first', 'program!')

>>> # Try adding a value to row 10
>>> spreadsheet["B10"] = "test"
>>> print_rows()
('first', 'program!') 
(None, None) 
(None, None) 
(None, None) 
(None, None) 
(None, None) 
(None, None) 
(None, None) 
(None, None) 
(None, 'test') 
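The print_rows() helper used in the session above is not defined in this article. A minimal sketch, assuming it simply prints every row of the active sheet as a tuple of values, could be:

```python
from openpyxl import Workbook

# Recreate the sheet from the first simple program.
workbook = Workbook()
spreadsheet = workbook.active
spreadsheet["A1"] = "first"
spreadsheet["B1"] = "program!"


def print_rows():
    """Print every row of the sheet as a tuple of cell values."""
    for row in spreadsheet.iter_rows(values_only=True):
        print(row)


print_rows()

# Writing to row 10 makes openpyxl create the cell, so the used range
# now spans rows 1 to 10 and the rows in between print as (None, None).
spreadsheet["B10"] = "test"
print_rows()
```

Because the helper reads the module-level spreadsheet variable, it always reflects the current state of the sheet, which makes it handy for the insert and delete experiments in the next section.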

Managing Rows and Columns in Spreadsheet 

Inserting and deleting rows and columns is one of the most basic operations on a spreadsheet. In openpyxl, we can perform these operations by calling the following methods with the appropriate arguments:

  • .insert_rows()
  • .delete_rows()
  • .insert_cols()
  • .delete_cols()

Each of these methods accepts two arguments:

  1. idx
  2. amount

idx is the position of the row or column at which to insert or delete (counting from 1), and amount is the number of rows or columns to insert or delete. If amount is omitted, it defaults to 1.

Building on the first simple program, let’s see how we can use these methods:

>>> print_rows()
('first', 'program!')

>>> # Insert a column at the first position before column 1 ("A")
>>> spreadsheet.insert_cols(idx=1)
>>> print_rows()
(None, 'first', 'program!')

>>> # Insert 5 columns in  between column 2 ("B") and 3 ("C")
>>> spreadsheet.insert_cols(idx=3,amount=5)
>>> print_rows()
(None, 'first', None, None, None, None, None, 'program!')

>>> # Delete the created columns
>>> spreadsheet.delete_cols(idx=3, amount=5)
>>> spreadsheet.delete_cols(idx=1)
>>> print_rows()
('first', 'program!')

>>> # Insert a new row in the beginning
>>> spreadsheet.insert_rows(idx=1)
>>> print_rows()
(None, None)
('first', 'program!')

>>> # Insert 3 new rows in the beginning 
>>> spreadsheet.insert_rows(idx=1,amount=3)
>>> print_rows()
(None, None) 
(None, None) 
(None, None) 
(None, None) 
('first', 'program!')

>>> # Delete the first 4 rows 
>>> spreadsheet.delete_rows(idx=1, amount=4)
>>> print_rows()
('first', 'program!') 

Managing Sheets

We have seen the following recurring piece of code in our previous examples. It is one of the ways of selecting the default sheet from the workbook:

spreadsheet = workbook.active

However, if the workbook has multiple sheets, you can select a specific one by its title:

>>> # Let's say you have two sheets: "Products" and "Company Sales"
>>> workbook.sheetnames
['Products', 'Company Sales']

>>> # You can select a sheet using its title
>>> Products_Sheet = workbook["Products"]
>>> Sales_sheet = workbook["Company Sales"]

If we want to change the title of the Sheet, execute the following code: 

>>> workbook.sheetnames
['Products', 'Company Sales']

>>> Products_Sheet = workbook["Products"]
>>> Products_Sheet.title = "New Products"

>>> workbook.sheetnames
['New Products', 'Company Sales']

We can also create and delete sheets with the help of two methods, .create_sheet() and .remove():

>>> #To print the available sheet names
>>> workbook.sheetnames 
['Products', 'Company Sales']

>>> #To create a new Sheet named "Operations"
>>> Operations_Sheet = workbook.create_sheet("Operations")

>>> #To print the updated available sheet names
>>> workbook.sheetnames
['Products', 'Company Sales', 'Operations']

>>> # Create a sheet at a given position; index 0 makes "HR" the first sheet
>>> HR_Sheet = workbook.create_sheet("HR", 0)

>>> # Print the updated sheet names again
>>> workbook.sheetnames
['HR', 'Products', 'Company Sales', 'Operations']

>>> # To remove a sheet, pass the worksheet object (not its name) to .remove()
>>> workbook.remove(Operations_Sheet)
>>> workbook.sheetnames
['HR', 'Products', 'Company Sales']

>>> # Delete the HR sheet as well
>>> workbook.remove(HR_Sheet)
>>> workbook.sheetnames
['Products', 'Company Sales']

Adding Filters to the Spreadsheet 

We can use openpyxl to add filters and sorts to our spreadsheet. However, openpyxl only stores these settings in the file; when we open the spreadsheet, the data won’t have been rearranged according to them.

When you’re programmatically creating a spreadsheet that is going to be sent to and used by someone else, it is good practice to add filters so that people can use them afterward.

The code below shows how to add a simple filter to your spreadsheet:

>>> # Check the used spreadsheet space using the attribute "dimensions"
>>> spreadsheet.dimensions
'A1:O100'

>>> spreadsheet.auto_filter.ref="A1:O100"
>>> workbook.save(filename="watch_sample_with_filters.xlsx")
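To illustrate the point that the data is not rearranged, here is a small sketch that also records a filter value and a sort condition using the auto_filter methods add_filter_column() and add_sort_condition(). The three sample rows are made up for the example.

```python
from openpyxl import Workbook

workbook = Workbook()
spreadsheet = workbook.active
for row in [["marketplace", "star_rating"], ["US", 5], ["UK", 2], ["US", 4]]:
    spreadsheet.append(row)

# Turn on the filter drop-downs for the whole used range.
spreadsheet.auto_filter.ref = spreadsheet.dimensions

# Record a filter on the first column and a sort on the second one.
spreadsheet.auto_filter.add_filter_column(0, ["US"])
spreadsheet.auto_filter.add_sort_condition("B2:B4")

# The stored rows are untouched: openpyxl only writes the filter settings.
order = [row[0] for row in spreadsheet.iter_rows(min_row=2, values_only=True)]
print(order)  # ['US', 'UK', 'US']
```

The person opening the file still has to apply the filter or sort in Excel for the rows to be hidden or reordered.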

Adding Formulas to the Spreadsheet 

Formulas are one of the most commonly used and powerful features of spreadsheets. With openpyxl, adding a formula is as simple as editing a specific cell’s value: you assign the formula text to the cell, and Excel calculates the result when the file is opened.

The formula names that openpyxl recognizes are available in FORMULAE:

>>> from openpyxl.utils import FORMULAE
>>> FORMULAE
frozenset({'ABS',
           'AMORLINC',
           'ACCRINT', 
           'ACOS', 
           'ACCRINTM', 
           'ACOSH', 
            ...,       
           'AND', 
           'YEARFRAC', 
           'YIELDDISC', 
           'AMORDEGRC', 
           'YIELDMAT', 
           'YIELD', 
           'ZTEST'}) 

Let’s add some formulas to our spreadsheet. 

Let’s calculate the average star rating of the 99 reviews within the spreadsheet:

>>> # Star rating is in column "H"
>>> spreadsheet["P2"] = "=AVERAGE(H2:H100)"
>>> workbook.save(filename="first_example.xlsx")

Now, if you open the spreadsheet and go to cell P2, you will see the value 4.18181818181818.

Similarly, we can use this methodology to include any formulas for our requirements in our spreadsheet. For example, if we want to count the number of helpful reviews: 

>>> # The helpful votes  counted in column "I" 
>>> spreadsheet["P3"] = '=COUNTIF(I2:I100, ">0")' 
>>> workbook.save(filename = "first_example.xlsx") 
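It helps to remember that openpyxl stores a formula as text and never evaluates it; the number only appears once a spreadsheet application opens the file and recalculates. A short sketch with made-up ratings shows what the cell holds from Python's point of view:

```python
from openpyxl import Workbook

workbook = Workbook()
spreadsheet = workbook.active

# A few made-up star ratings in column H (column number 8).
spreadsheet["H1"] = "star_rating"
for row_number, rating in enumerate((5, 4, 3), start=2):
    spreadsheet.cell(row=row_number, column=8, value=rating)

spreadsheet["P2"] = "=AVERAGE(H2:H4)"
cell = spreadsheet["P2"]

print(cell.value)      # the formula text, not the computed average
print(cell.data_type)  # "f" marks the cell as holding a formula
```

If you later need the computed number in Python, you can load a file that Excel has already saved with load_workbook(filename, data_only=True), which returns the cached results instead of the formula text.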

Adding Styles to the Spreadsheet

Styling is not essential and we rarely need it in everyday code, but for the sake of completeness, let’s look at it with the following example.

With openpyxl, we get multiple styling options such as fonts, colors, borders, and so on.

Let’s have a look at an example:

>>> # Import necessary style classes
>>> from openpyxl.styles import Font, Alignment, Border, Side

>>> # Create a few styles
>>> bold_font = Font(bold=True)
>>> # "00FF0000" is the aRGB hex code for red (colors.RED is not available in recent openpyxl releases)
>>> big_red_text = Font(color="00FF0000", size=20)
>>> center_aligned_text = Alignment(horizontal="center")
>>> double_border_side = Side(border_style="double")
>>> square_border = Border(top=double_border_side,
...                        right=double_border_side,
...                        bottom=double_border_side,
...                        left=double_border_side)

>>> # Style some cells!
>>> spreadsheet["A2"].font = bold_font
>>> spreadsheet["A3"].font = big_red_text
>>> spreadsheet["A4"].alignment = center_aligned_text
>>> spreadsheet["A5"].border = square_border
>>> workbook.save(filename="sample_styles.xlsx")

If you want to apply multiple styles to one or several cells in a spreadsheet, you can use the NamedStyle class:

>>> from openpyxl.styles import NamedStyle

>>> # Let's create a style template for the header row
>>> header = NamedStyle(name="header")
>>> header.font = Font(bold=True)
>>> header.border = Border(bottom=Side(border_style="thin"))
>>> header.alignment = Alignment(horizontal="center",vertical="center")

>>> # Now let's apply this to all first row (header) cells
>>> header_row = spreadsheet[1]
>>> for cell in header_row:
...     cell.style = header

>>> workbook.save(filename="sample_styles.xlsx") 

Adding Charts to our Spreadsheet

Charts are a good way to visualize and understand large amounts of data quickly and easily. openpyxl supports many chart types, such as bar charts, pie charts, line charts, and so on.

Let us start by creating a new workbook with some data: 

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
spreadsheet = workbook.active

# Let's create some sample sales data
rows = [
    ["Product", "Online", "Store"],
    [1, 30, 45],
    [2, 40, 30],
    [3, 40, 25],
    [4, 50, 30],
    [5, 30, 25],
    [6, 25, 35],
    [7, 20, 40],
]

for row in rows:
    spreadsheet.append(row)

Now let us create a bar chart that will show the total number of sales per product: 

chart = BarChart()
data = Reference(worksheet=spreadsheet,
                 min_row=1,
                 max_row=8,
                 min_col=2,
                 max_col=3)

chart.add_data(data, titles_from_data=True)
spreadsheet.add_chart(chart, "E2")

workbook.save("chart.xlsx")

You can also create a line chart by simply making some changes to the data: 

import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
     "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

# Fill the twelve month columns of each product row with random sales
for row in sheet.iter_rows(min_row=2,
                           max_row=4,
                           min_col=2,
                           max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)
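The snippet above fills the sheet with random data but stops before a chart is created. A minimal sketch of the remaining steps follows, assuming one line per product row. The anchor cell "C6" and the file name line_chart.xlsx are arbitrary choices for this illustration.

```python
import random

from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

months = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Header row, then three product rows with random monthly sales.
sheet.append([""] + months)
for product in (1, 2, 3):
    sheet.append([product] + [random.randrange(5, 100) for _ in months])

chart = LineChart()

# Include column 1 so the product number can serve as each series title.
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=4,
                 min_col=1,
                 max_col=13)

# from_rows=True tells openpyxl that every row is one series (one line).
chart.add_data(data, from_rows=True, titles_from_data=True)
sheet.add_chart(chart, "C6")

workbook.save("line_chart.xlsx")
```

The main difference from the bar chart example is from_rows=True: there each column was a series, while here each product row becomes its own line across the twelve months.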

There are numerous types of charts and various types of customizations you can apply to your spreadsheet to make it more attractive.

Convert Python Classes to Excel Spreadsheet

Let us now learn how to convert Python classes into an Excel spreadsheet.

Assume we have a database and we use an object-relational mapping (ORM) to map its tables to Python classes, and we now want to export those objects into a spreadsheet:

from dataclasses import dataclass
from typing import List

@dataclass
class Sale:
    id: str 
    quantity: int

@dataclass 
class Product: 
    id: str 
    name: str 
    sales:List[Sale] 

Now, let’s generate some random data, assuming that the above classes are stored in a db_classes.py file:

import random

# Ignore these for now. You'll use them in a sec ;)
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

from db_classes import Product, Sale

products_range = []

# Let's create 5 products
for idx in range(1, 6):
    sales = []

    # Create 5 months of sales
    for month in range(1, 6):
        # Sale also declares an id field, so one is built here from the
        # product and month numbers (an illustrative choice)
        sale = Sale(id=f"{idx}-{month}",
                    quantity=random.randrange(5, 100))
        sales.append(sale)

    product = Product(id=str(idx),
                      name="Product %s" % idx,
                      sales=sales)
    products_range.append(product)

Running this code gives us 5 products, each with 5 months of sales and a random sales quantity for each month.

Now, to convert this into a spreadsheet, we need to iterate over the data:

workbook = Workbook()
spreadsheet = workbook.active

# Append column names first
spreadsheet.append(["Product ID", "Product Name", "Month 1",
                    "Month 2", "Month 3", "Month 4", "Month 5"])

# Append the data
for product in products_range:
    data = [product.id, product.name]
    for sale in product.sales:
        data.append(sale.quantity)
    spreadsheet.append(data)

This will create a spreadsheet with some data coming from your database. 

How to work with pandas to handle Spreadsheets?

So far we have worked with Excel in Python using openpyxl. pandas is another great tool for working with Excel files in Python. It can read the data in an Excel file into a DataFrame, and it can export a DataFrame back to Excel.

To use it, we first need to install the pandas package:

$ pip install pandas 

Then, let’s create a simple DataFrame: 

import pandas as pd

data = {
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
}
dataframe = pd.DataFrame(data)

Now that we have some data, we can convert the DataFrame into a worksheet using dataframe_to_rows() from openpyxl.utils.dataframe:

from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

workbook = Workbook()
spreadsheet = workbook.active

for row in dataframe_to_rows(dataframe, index=False, header=True):
    spreadsheet.append(row)

workbook.save("pandas_spreadsheet.xlsx")

To read an Excel file into a pandas DataFrame, use the read_excel() function:

excel_file = 'movies.xls'
movies = pd.read_excel(excel_file)

We can also use the ExcelFile class to work with multiple sheets from the same Excel file:

xlsx = pd.ExcelFile(excel_file)
movies_sheets = []
for sheet in xlsx.sheet_names:
    movies_sheets.append(xlsx.parse(sheet))
movies = pd.concat(movies_sheets)

Indexes and columns allow you to access data from your DataFrame easily:

>>> df.columns 
Index(['marketplace', 'customer_id', 'review_id', 'product_id', 
       'product_parent', 'product_title', 'product_category', 'star_rating', 
       'helpful_votes', 'total_votes', 'vine', 'verified_purchase', 
       'review_headline', 'review_body', 'review_date'], 
      dtype='object') 
>>> # Get first 10 reviews' star rating 
>>> df["star_rating"][:10]
R3O9SGZBVQBV76    5 
RKH8BNC3L5DLF     5 
R2HLE8WKZSU3NL    2 
R31U3UH5AZ42LL    5 
R2SV659OUJ945Y    4 
RA51CP8TR5A2L     5 
RB2Q7DLDN6TH6     5 
R2RHFJV0UYBK3Y    1 
R2Z6JOQ94LFHEP    5 
RX27XIIWY5JPB     4 
Name: star_rating, dtype: int64

>>> # Grab review with id "R2EQL1V1L6E0C9", using the index
>>> df.loc["R2EQL1V1L6E0C9"]
marketplace               US 
customer_id         15305006 
review_id     R2EQL1V1L6E0C9 
product_id        B004LURNO6 
product_parent     892860326 
review_headline   Five Stars 
review_body          Love it 
review_date       2015-08-31 
Name: R2EQL1V1L6E0C9, dtype: object 
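The session above uses a DataFrame named df that is indexed by review ID, but the article does not show how it was built. One possible way, sketched here with a small made-up sheet rather than the real watch sample, is to take the worksheet's values, use the first row as column names, and use the review_id column as the index:

```python
import pandas as pd
from openpyxl import Workbook

# Small in-memory stand-in for the watch sample (values are made up).
workbook = Workbook()
spreadsheet = workbook.active
spreadsheet.append(["marketplace", "customer_id", "review_id", "star_rating"])
spreadsheet.append(["US", 1, "R1", 5])
spreadsheet.append(["US", 2, "R2", 3])

rows = spreadsheet.values         # generator of value tuples, header first
columns = list(next(rows))        # the first row holds the column names
data = list(rows)                 # the remaining rows hold the records
index = [row[2] for row in data]  # review_id is the third column here

df = pd.DataFrame(data, index=index, columns=columns)

print(df["star_rating"][:10])
print(df.loc["R2"])
```

Keeping review_id both as the index and as a regular column matches the output shown above, where the selected row still lists its review_id.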

Summary 

In this article we have covered: 

  • How to extract information from spreadsheets  
  • How to create Spreadsheets in different ways 
  • How to customize a spreadsheet by adding filters, styles, or charts and so on 
  • How to use pandas to work with spreadsheets 

Now you are well aware of the different things you can do with spreadsheets using Python. If you want more information on this topic, you can always rely on the official documentation of openpyxl. To gain more knowledge about Python tips and tricks, check out our Python tutorial. To gain mastery over Python coding, join our Python certification course.

 

Priyankur Sarkar

Data Science Enthusiast

Priyankur Sarkar loves to play with data and get insightful results out of it, then turn those data insights and results in business growth. He is an electronics engineer with a versatile experience as an individual contributor and leading teams, and has actively worked towards building Machine Learning capabilities for organizations.

Join the Discussion

Your email address will not be published. Required fields are marked *

Suggested Blogs

Top-Paying Programming Certifications for 2021

Programming is at the core of software development, which is why there is a huge demand for programmers—a demand that is growing exponentially and is expected to rise at a steady rate even in the future. In today’s world, just about everything is getting automated and digitization has become the new normal.Recruiters are on the lookout for professionals who have solid programming and full-stack development skills. Every recruiting agency and organizational HR recruiting team has put in place a thorough screening process, and this active hiring in startups, SMEs, and multinational companies has raised the bar for many aspiring programmers. Having a software development, web development, or programming certification will give you an upper hand at the time of recruitment. A certification from a reputed accreditation body will validate your skills and make you stand out among your peers.Having an extra certification apart from your UG or PG degree makes you a better fit for the job role in which you have an interest. But before you opt for any certification, you need to understand which programming language will take you where; and the potential benefits of pursuing a certification course of that particular programming language.In this article, you will get to know about the top programming certifications of 2021 and how to achieve them.What are Programming certifications?Programming certifications are exam-oriented, and verify your skill and expertise in that field. Different organizations provide different programming certification exams that define your level, skills, and abilities vis `a vis that programming language. 
Having a programming certification will give you an edge over other peers and will highlight your coding skills.Most Popular Programming CertificationsC & C++ CertificationsOracle Certified Associate Java Programmer OCAJPCertified Associate in Python Programming (PCAP)MongoDB Certified Developer Associate ExamR Programming CertificationOracle MySQL Database Administration Training and Certification (CMDBA)CCA Spark and Hadoop Developer1. C & C++ CertificationsRegardless of your specialization in your UG or PG courses, clearing a developer-rated certification will not only make your resume stand out from others but also enhance your skills and boost your confidence. We have curated the top-most and popularly available certifications with descriptions that can help you decide which one is appropriate for your career path.C & C++ Certifications: C and C++ are often called the mother of Procedure-oriented and Object-oriented programming languages, which is absolutely true. These two programming languages have been around for many decades. Colleges and universities all over the world teach these as the base language. To get global recognition for the C and C++ certification, C++ Institute and Pearson VUE decided to carve a niche in this part of the certification landscape by offering the world's first international C/C++ certifications.Aspirants and professionals can choose either C or C++ as a career option/path and climb the certification ladder from associate to professional to senior. 
The C Programming Language Certified Associate (CLA) and C++ Certified Associate Programmer (CPA) are the core and first-level C and C++ certifications.CLA comprises of topics likeIntroduction to compiling and software development;Basic scalar data types and their operators;Flow control;Complex data types: arrays, structures and pointers;Memory management;Files and streams;Structuring the code: functions and modules;Preprocessor directives and complex declarations.CPA comprises of topics likeIntroduction to compiling and software development;Basic scalar data types, operators, flow control, streamed input/output, conversions;Declaring, defining and invoking functions, function overloading;Data aggregates;String processing, exceptions handling, dealing with namespaces;Object-oriented approach and its vocabulary;Dealing with classes and objects, class hierarchy and inheritance;Defining overloaded operators, user-defined operators, exceptions;Demand and Benefits: Having a CLA certification verifies that the programmer or the aspirant has an understanding of all the necessary and essential universal concepts of computer programming and developer tools. The course also covers all the syntax and semantics of different C constructs plus the data types offered by the language. This course brings crisp knowledge on writing programs using standard language infrastructure regardless of the hardware or software platform.A C++ Certified Associate Programmer (CPA) certification will give you an upper hand because it comprises syntax and semantics of the C++ language plus basic C++ data types. Apart from that, it contains principles of the object-oriented model and C++ implementation. Also, you will get to know about the various C++ standard libraries through this certification process. The average entry-level salary of a C/C++ developer with this certification will be $ 7,415 per annum. 
With two to three years of experience, the average salary hikes to $ 10,593 annually.Top companies and industries hiring CLA and CPA are Philips, Calsoft Pvt. Ltd., Cognizant, Synopsys Inc., private universities, Mphasis, etc.Where to take Training for Certification: CPP Institute has all the study resources you need to prepare for this examination. Apart from that, you can study from YouTube free resources.Who should take the Training (roles) for Certification: Any programmer or computer science aspirant - who wants to expand their knowledge of C/C++ or start their career as a C/C++ programmer or developer can opt for this certification course. There is no other prerequisite to appear for this exam.Course fees for Certification:CLA Certification: $ 147.50 (50% discount voucher)CPA Certification: $ 147.50 (50% discount voucher)Exam fee for certification:CLA Certification: $ 295CPA Certification: $ 295Retake fee for certification: Aspirants who have paid the complete exam price (USD 295) or have completed a course aligned with certification in the self-study mode (50% discount voucher) can have a free retake of the CPA or CLA exam. There is no limit to the number of times a candidate may retake the exam. You must wait 15 days before being allowed to re-sit that exam.2. Oracle Certified Associate Java Programmer OCAJPThis is a Java programming certification provided by Oracle. Java is among the most popular programming languages. James Gosling is the creator of Java which was earlier named Oak. It is a robust, high-level, general-purpose, pure object-oriented programming language developed by Sun Microsystems (now part of Oracle). Java consistently tops the 'most used programming languages’ list and is one of the most extensively used software development platforms. 
If you plan to take a proper online training course before appearing for the certification exam, KnowledgeHut (https://www.knowledgehut.com/programming/java-training) has that for you.

It is the preliminary and most basic certification provided by Oracle for Java. It helps you gain a fundamental understanding of Java programming and builds a foundation in Java and other general programming concepts. The certification has two subcategories: OCAJP Java Standard Edition 8 (OCAJP 8) and OCAJP Java Standard Edition 11 (OCAJP 11).

It covers topics such as: Java Basics; Working with Java Data Types; Using Operators and Decision Constructs; Creating and Using Arrays; Using Loop Constructs; Working with Methods and Encapsulation; Working with Inheritance; Handling Exceptions; Class Methods and Encapsulation; Describing and Using Objects and Classes; Java Technology and the Java Development Environment; Inheritance and Interfaces; Understanding Modules; Working with Java Arrays; Working with Selected classes, Java Primitive Data Types and String APIs.

Demand and Benefits: Having an OCAJP certification verifies that the aspirant has the essential skills to become an expert Java developer. This certification also helps in getting an internship or entry-level jobs in different organizations. The entry-level salary of a junior Java developer with this certification is $ 3670 per annum; when the candidate gathers two to three years of experience, the average salary rises to $ 5430 annually.

Top companies and industries hiring Oracle Certified Associate Java Programmers are Smart Monitor Pvt. Ltd., Fiserv, Micron Semiconductor Asia Pvt. Ltd., private universities and many others.

Where to take Training for Certification: KnowledgeHut has a fascinating course opportunity for beginners in Java programming. It has workshops with hands-on learning and 40 hours of instructor-led online lectures.
Apart from that, Oracle also provides exam vouchers for this certification course.

Who should take the Training (roles) for Certification: Any programmer or computer science aspirant who wants to become a Java developer or start a career as a Java programmer can opt for this certification course. There is no other prerequisite to appear for this exam.

Course fee for Certification: $ 245

Application fee for certification: OCAJP8: $ 245; OCAJP11: $ 249

Exam fee for certification: OCAJP8: $ 245; OCAJP11: $ 255

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option. If the exam retake option is available, one can opt for the exam after 14 days.

3. Certified Associate in Python Programming (PCAP)

Python is an interpreted, general-purpose, high-level programming language developed by Guido van Rossum. Python was first released in 1991 and has since become one of the most popular and widely used programming languages in various disciplines. Today, companies use Python for GUI and CLI-based software development, web development (server-side), data science, machine learning, AI, robotics, drone systems, developing cyber-security tools, mathematics, system scripting, etc. PCAP is a professional Python certification credential that measures your competency in using the Python language to create code and your fundamental understanding of object-oriented programming.

It covers topics such as: Basic concepts of Python; Operators & data types; Control and Evaluations; Modules and Packages; Data Aggregates; Exception Handling; Strings; Functions and Modules; Object-Oriented Programming; List Comprehensions, Lambdas, Closures, and I/O Operations; Classes, Objects, and Exceptions.

Demand and Benefits: Having a Python certification verifies that the programmer or the aspirant has the essential skills needed to become an expert Python developer.
This certification also helps in getting an internship or entry-level jobs in different organizations. The average entry-level salary of a Python developer starts at around $100k per annum. With a few years of experience, the average salary rises to $ 105k annually.

Top companies and organizations hiring certified Python programmers are Bank of America, Atlassian, Google, Adobe, Apple, Cisco Systems, Intel, Lyft, IBM, etc.

Where to take Training for Certification: KnowledgeHut has a fascinating course opportunity for beginners in Python programming. It has hands-on learning with 24 hours of instructor-led online lectures. Apart from that, the course has 100 hours of MCQs and three live projects.

Who should take the Training (roles) for Certification: Any programmer, graduate, post graduate student, or computer science aspirant who wants to pursue a career as a Python developer or Python programmer can opt for this certification training. There is no other prerequisite to appear for this exam.

Course fees for Certification: $ 295

Exam fee for certification: $ 295

Retake fee for certification: A candidate who fails the exam has to wait 15 days before being allowed to retake the exam for free. There is no limit to the number of times a candidate may retake an exam.

4. MongoDB Certified Developer Associate Exam

MongoDB is a NoSQL, document-based, high-volume heterogeneous database system. Instead of having tables with rows and columns, MongoDB uses collections of documents. It is a database system that provides scalability and flexibility as per query requirements. Its document models are easy for developers to implement and can meet complex demands at scale.

MongoDB created the MongoDB Certified Developer Associate Exam for individuals who need to verify their knowledge of the fundamentals of designing and building applications using MongoDB.
They recommend this certification for those who want to become software engineers and have a solid understanding of core MongoDB along with professional experience.

It covers topics such as: MongoDB Basics; CRUD; Indexing and Performance; The MongoDB Aggregation Framework; Basic Cluster Administration; Aggregation & Replication; Sharding; MongoDB Performance; MongoDB for Python Developers; MongoDB for Java Developers or MongoDB for JavaScript Developers; Data Modeling.

Demand and Benefits: Having a MongoDB Certified Developer Associate certification verifies that the programmer or the aspirant has the essential skills to become a NoSQL database expert. The MongoDB certification is inexpensive and in demand. The average salary for a software developer with MongoDB skills starts from $ 8200 per annum.

Top companies and organizations hiring certified MongoDB developers are Accenture, Collabera, Leoforce LLC., Adobe, Trigent Software, Lyft, etc.

Where to take Training for Certification: KnowledgeHut has a comprehensive course structure for those who want to learn MongoDB & MongoDB Administrator. It has 24+ hours of instructor-led online lectures and 80+ hours of hands-on practice with cloud labs. This self-paced course also includes capstone projects to give participants a feel of real-world work.

Who should take the Training (roles) for Certification: Any programmer, graduate, post graduate student, experienced developer or computer science aspirant who wants to embark on a career as a MongoDB developer, start a career as a NoSQL database expert, or do better in a current role as a MongoDB developer can opt for this certification course. There is no other prerequisite to appear for this exam.

Course fees for Certification: $ 150

Exam fee for certification: $ 150

Retake fee for certification: MongoDB University no longer allows a free retake with the exam fee. The candidate has to pay an additional $10 to reschedule or retake the exam.

5. R Programming Certification

It is a part of the data science specialization from Johns Hopkins University on Coursera. This course teaches R programming for efficient data analysis. It covers different R programming concepts like the building blocks of R, data types, reading data into R from external files, accessing packages, writing functions, debugging techniques, profiling R code, and performing analysis.

It covers topics such as: Basic building blocks in R; Data types in R; Control Structures; Scoping Rules - Optimization; Coding Standards; Dates and Times; Functions; Looping; Debugging tools; Simulating data in R; R Profiler.

Demand and Benefits: Having an R Programming certification verifies that the programmer or the aspirant has the essential skills required to get a job as a data analyst. This certification also helps in getting an internship or entry-level jobs in different organizations and firms. The average salary of a certified R programmer with this certification is ₹ 508,224 per annum.

Top companies and industries hiring certified R programmers are Technovatrix, CGI Group Inc., Amazon, Sparx IT Solutions, Accenture, Uber, etc.

Where to take Training for Certification: KnowledgeHut has a fascinating training course for those who want to become R programmers. It has 22+ hours of instructor-led live training and three self-paced live projects.

Who should take the Training (roles) for Certification: Any data analyst, graduate, post graduate student, experienced data analyst or computer science aspirant who wants to work as an R programmer or data analyst can opt for this certification course. There is no other prerequisite to appear for this exam.

Course fees for Certification: Free

Fee for certification: $ 60 (Coursera Plus Monthly)

Retake fee for certification: Free

6. Oracle MySQL Database Administration Training and Certification (CMDBA)

It is another course offered by Oracle for SQL developers.
Oracle University designed this course for database administrators who want to validate their skills in improving performance, integrating business processes, and accomplishing data processing work. Structured Query Language (SQL) is one of the top database query languages and allows us to access and manipulate databases. If you want to verify your database skills during a job interview or impress your peers at your workplace, this certification is worth getting. This certification path includes Professional, Specialist, and Developer levels. The candidate should pass the MySQL Database Administrator Certified Professional Exam Part 1 & Part 2 to earn the certification.

It covers topics such as: Installing MySQL; MySQL Architecture; Configuring MySQL; User Management; MySQL Security; Maintaining a Stable System; Optimizing Query Performance; Backup Strategies; Configuring a Replication Topology.

Demand and Benefits: Having a CMDBA certification verifies that the programmer or the aspirant has the essential skills required to get a job as an SQL developer. This certification also helps in getting an internship or entry-level jobs in different organizations and firms. The average salary of a certified MySQL DBA or backend developer with this certification is $ 66,470 per annum.

Top companies and industries hiring Certified MySQL database administrators are Fiserv, IBM, HCL, Adobe, Microsoft, Apple, Accenture, Collabera, and more.

Where to take Training for Certification: KnowledgeHut has a cutting-edge curriculum for those who want to become MySQL database administrators. It has 16+ hours of instructor-led online lectures and 80+ hours of hands-on lab.
Apart from that, this self-paced course has capstone projects.

Who should take the Training (roles) for Certification: Any developer, graduate, post graduate student, experienced developer or computer science aspirant who wants to pursue a career as a DBA or backend developer, or start a career in database management or backend software development, can opt for this certification course. There is no other prerequisite to appear for this exam or course.

Course fees for Certification: $ 255

Exam fee for certification: $ 255

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option. If the exam retake option is available, one can opt for the exam 14 days after the initial attempt.

7. CCA Spark and Hadoop Developer

With the exponential growth in data, IT firms and organizations have to manage the tremendous amount of data generated. So, many companies are actively looking for Big Data and Spark developers who can optimize performance. Big Data is the term used to describe enormous volumes of data. Apache Spark supports data management as an open-source, unified analytics engine that handles large-scale data processing.

The certification requires prior knowledge of Scala and Python. It also verifies and showcases your skills through Spark and Hadoop projects.
Passing this certification course gives you a logo and a license to authenticate your CCA status.

It covers topics such as: Load data from HDFS for use in Spark applications; Write the results back into HDFS using Spark; Read and write files in a variety of file formats; Perform standard extract, transform, load (ETL) processes on data using the Spark API; Use metastore tables as an input source or an output sink for Spark applications; Understand the fundamentals of querying datasets in Spark; Filter data using Spark; Write queries that calculate aggregate statistics; Join disparate datasets using Spark; Produce ranked or sorted data; Supply command-line options to change your application configuration, such as increasing available memory.

Demand and Benefits: Passing the CCA Spark and Hadoop Developer Exam (CCA175) by Cloudera verifies that you have all the essential skills required to get a job as a Hadoop developer and handle Big Data projects. The average salary of a certified CCA Spark and Hadoop Developer with this certification is $ 74,200 per annum.

Top companies and industries hiring Certified Spark and Hadoop Developers are Primus Global, IBM, Collabera, CorroHealth, Genpact, Xerox, Accenture, and more.

Where to take Training for Certification: KnowledgeHut has extensive courses for those who want to become Big Data experts and want to work as Hadoop developers. It has different courses on Big Data Analytics, Apache Storm, Hadoop Administration, Apache Spark & Scala, Big Data with Hadoop, and more.

Who should take the Training (roles) for Certification: Any Big Data developer, graduate or post graduate student, Hadoop developer or computer science aspirant who wants to make a career in Big Data development or start a career as a Big Data or Hadoop project developer can opt for this certification course.
There is no other prerequisite to appear for this exam.

Course fees for Certification: $ 295

Application fee for certification: $ 295

Exam fee for certification: $ 295

Retake fee for certification: Within 30 to 60 minutes of exam completion, Cloudera will send a scorecard mail with a pass or fail status. If the candidate fails the exam, they have to wait 30 days before another try. Cloudera gives additional discounts on retakes.

Conclusion

Whether you are starting your career as a coder or are an experienced programmer looking to grow in the industry, having a certification and proper knowledge of a popular programming language is one of the most proven ways to elevate your programming career. We trust that this article will help you to understand your area of interest. Choose wisely the programming language you wish to build a career in; the choice will also depend on your pre-existing knowledge. If you aren't sure which resource will be most informative for your certification, KnowledgeHut (https://www.knowledgehut.com/) has the support and expert trainers who can guide you from start to finish, both in clearing the exam and in gaining sound knowledge of your preferred subject.

Receiving a programming certification is an added bonus which will make you stand out from the rest. Proper training from an institute such as KnowledgeHut will help you gain skills that are relevant and in demand in the industry.
Top IT Certifications for Java Developers in 2021

Programming languages are at the heart of computer science and software development. They help developers write efficient code for developing digital solutions through applications and websites. Programming helps in automating, maintaining, assembling, and measuring the processed data.

Java is one such popular programming language. It is a robust, high-level, general-purpose, pure object-oriented programming language developed by Sun Microsystems (now part of Oracle). James Gosling is the creator of Java, which was earlier named Oak. Java ranks high in the top programming languages list and is one of the most extensively used software development platforms. It is well suited to developing software solutions and other innovative projects and simulations.

Since Oracle acquired Sun Microsystems in January 2010, it has been responsible for the further development of the Java platform. All the top Java certifications mentioned here verify a specific level of expertise and knowledge of the Java platform, each highlighting particular domains. Without further ado, let us now dig into the top 5 Java certifications and their details.

About Oracle's Java Certifications

Organizations and industries consider certifications as proof of knowledge, especially when the certifications are from a recognized body or firm. Aspirants and professionals looking for possibilities in the Java development domain can gain a plethora of benefits through the certifications mentioned in this article. There are six levels of Oracle Java Certification based on job roles, skills, and responsibilities: Oracle Certified Junior Associate (OCJA); Oracle Certified Associate (OCA); Oracle Certified Professional (OCP); Oracle Certified Specialist (OCS); Oracle Certified Expert (OCE); Oracle Certified Master (OCM).

Among them, the top five Java certifications in demand for the year 2021 are:

1. Oracle Certified Associate Java Programmer OCAJP

It is the preliminary and most basic certification provided by Oracle for Java. It helps you gain a fundamental understanding of Java programming and build a foundation in Java and other general programming concepts. There are two subcategories in this certification: OCAJP Java Standard Edition 8 (OCAJP 8) and OCAJP Java Standard Edition 11 (OCAJP 11).

OCAJP8 covers topics such as: Creating and Using Arrays; Handling Exceptions; Java Basics; Using Loop Constructs; Using Operators and Decision Constructs; Working with Inheritance; Working with Java Data Types; Working with Methods and Encapsulation; Working with Selected classes from the Java API.

OCAJP11 covers topics such as: Applying Encapsulation; Creating and Using Methods; Creating Simple Java Programs; Describing and Using Objects and Classes; Handling Exceptions; Java Technology and the Java Development Environment; Programming Abstractly Through Interfaces; Reusing Implementations Through Inheritance; Understanding Modules; Using Operators and Decision Constructs; Working with Java Arrays; Working with Java Primitive Data Types and String APIs.

Demand and Benefits: Having an OCAJP certification verifies that the programmer or the aspirant has the essential skills to become an expert Java developer. This certification also helps in getting an internship or entry-level jobs in different organizations. The entry-level salary of a junior Java developer with this certification is $ 3670 per annum; when the candidate gathers two to three years of experience, the average salary rises to $ 5430 annually. (Source: Glassdoor)

Top companies and industries hiring Oracle Certified Associate Java Programmers are Smart Monitor Pvt. Ltd., Fiserv, Micron Semiconductor Asia Pvt. Ltd., and more.

Where to take Training for Certification: KnowledgeHut has a fascinating course, designed for beginners in Java programming.
It offers hands-on learning with 40 hours of instructor-led online lectures. Apart from that, Oracle also provides exam vouchers for this certification course.

Who should take the Training (roles) for Certification: Any programmer or computer science aspirant who wants to be a Java developer or start a career as a Java programmer can opt for this certification course. There is no other prerequisite to appear for this exam.

Course fees for Certification: $ 245

Application fee for certification: $ 245

Exam fee for certification: $ 245

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option. If the exam retake option is available, one can opt for the exam after 14 days.

2. Oracle Certified Professional Java Programmer OCPJP

It is a professional-level certification program provided by Oracle for Java developers. It verifies the candidate's knowledge and professional expertise. Using this certification, aspirants and other hard-core Java programmers can distinguish themselves from Java professionals who are not certified. It is the second level in Oracle's Java Certification list. There are two subcategories of this certification: OCPJP Java Standard Edition 8 (OCPJP 8) and OCPJP Java Standard Edition 11 (OCPJP 11).

This certification is preferable if someone has professional experience with Java or has already worked for some years in Java technology.
OCPJP8 covers topics such as: Advanced Class Design; Building Database Applications with JDBC; Concurrency; Exceptions and Assertions; Generics and Collections; Java Class Design; Java File I/O (NIO.2); Java I/O Fundamentals; Java Stream API; Lambda Built-in Functional Interfaces; Localization; Use Java SE 8 Date/Time API.

OCPJP11 covers topics such as: Annotations; Built-in Functional Interfaces; Concurrency; Database Applications with JDBC; Exception Handling and Assertions; Functional Interface and Lambda Expressions; Generics and Collections; I/O (Fundamentals and NIO.2); Java Fundamentals; Java Interfaces; Java Stream API; Lambda Operations on Streams; Localization; Migration to a Modular Application; Parallel Systems; Secure Coding in Java SE Application; Services in a Modular Application.

Demand and Benefits: Once you are a certified Professional Java Programmer (OCPJP), you can move to better salary slabs and organizations that hire senior Java developers. This certification also helps in getting internal promotions as a Java developer in different organizations and firms. The average salary of a certified professional Java developer is $ 5300 - $ 8610 per annum.

Top companies and industries hiring Oracle Certified Professional Java Programmers are Oracle, Capgemini, Morgan Stanley, Chetu, Mphasis, etc.

Where to take Training for Certification: KnowledgeHut has a fascinating course opportunity for Java developers and professionals for learning intermediate Java topics. It has hands-on learning with 32 hours of instructor-led online lectures. Apart from that, Oracle also provides exam vouchers for this certification course.

Who should take the Training (roles) for Certification: Any Java programmer who wants to apply for a senior Java developer's role or start a career as a Java programmer can opt for this professional certification course. There is no other prerequisite to appear for this exam.
Course fees for Certification: $ 245

Application fee for certification: $ 245

Exam fee for certification: $ 245

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option. If the exam retake option is available, one can opt for the exam after 14 days.

3. Oracle Certified Expert - Web Component Developer OCEWCD

It is an intermediate-level course offered by Oracle for Java web developers. The Oracle Certified Expert Web Component Developer is for web developers who want to write web applications using Java. Through this course, they can prove their expertise in developing web apps using JSP and Servlet technologies. It verifies your expertise in Servlet 3.0 and helps in creating dynamic Web content and Web services.

It covers topics such as: Understanding Java EE Architecture; Managing Persistence using JPA entities and Bean Validation; Implementing business logic using EJBs; Using Java Message Service API; Implement SOAP Services using JAX-WS and JAXB APIs; Creating Java Web Applications using Servlets and JSPs; Implementing REST Services using JAX-RS API; Creating Java Applications using WebSockets; Developing Web Applications using JSFs; Securing Java EE 7 Applications; Using CDI Beans.

Demand and Benefits: You can opt for this course once you are a certified Professional Java Programmer (OCPJP) or a certified Associate Java Programmer. This certification course will help you get a job in organizations that work extensively with Servlets, JavaServer Pages, JSF, and web microservices. The average salary of a certified professional Java developer is $ 8,850 - $ 11,930 per annum.

Top companies and industries hiring Oracle Certified Web Component Developers are Amdocs, IBM, Oracle, Capgemini, SAP, Shine, Byjus, etc.

Where to take Training for Certification: KnowledgeHut has a fascinating course opportunity for Java web developers. It has hands-on learning with instructor-led online lectures and live projects.
Apart from this, you can get online training from Oracle University as well.

Who should take the Training (roles) for Certification: Any programmer or computer science aspirant who wants to work as a Java web developer or start a career as a Java web content and web service developer can opt for this certification course. As a prerequisite, you have to pass the OCPJP to opt for this certification.

Course fees for Certification: $ 245

Application fee for certification: $ 245

Exam fee for certification: $ 245

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option. If the exam retake option is available, one can opt for the exam after 14 days.

4. Oracle Certified Professional Java Application Developer (OCPJAD)

It is an advanced-level course offered by Oracle for Java application developers. The Oracle Certified Professional Java Application Developer (OCPJAD) is for software developers who want to write different applications and automation tools using Java. Through this course, developers can prove their expertise and ability to develop and deploy applications with Java Enterprise Edition 7. OCPJAD is ideal for desktop application developers, frontend and backend app developers, software engineers, and application architects.
It covers topics such as: Creating Batch API; Developing CDI Beans; Concepts of Concurrency; Creating Java Applications with Web-Sockets; Creating Java Web Applications with JSPs; Developing Java Web Applications with Servlets; Developing Web Applications with JSFs; Implementing Business Logic with EJBs; Performing REST Services with JAX-RS API; Implementing SOAP Services with JAX-WS and JAXB APIs; Java EE 7 system architecture; Java EE 7 Security Techniques; Java Message Service API; Managing Persistence with JPA Entities and Bean-Validation.

Demand and Benefits: Once you pass the Certified Professional Java Application Developer (OCPJAD) exam, you can seek employment in organizations that work on critical application development and command higher salaries. This professional certification will give you exposure to developing APIs, implementing business logic using EJBs, creating message services, and applying security systems. The average salary of a certified professional application developer is $ 9,800 - $ 13,910 per annum.

Top companies and industries hiring Oracle Certified Professional Java Application Developers are Oracle, Capgemini, NetSuite Inc., SAP, Cognizant, etc.

Where to take Training for Certification: KnowledgeHut has a fascinating course opportunity with hands-on learning exposure and live projects. Apart from this, you can get online training from Oracle University as well.

Who should take the Training (roles) for Certification: Any Java developer or full-stack application developer who wants to become a certified Java application developer, or to move into a specialized role such as REST API developer, security architect or software engineer, can opt for this certification course. As a prerequisite, you should have passed the OCAJP certification.

Course fees for Certification: $ 245

Application fee for certification: $ 245

Exam fee for certification: $ 245

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option.
If the exam retake option is available, one can opt for the exam after 14 days.

5. Oracle Certified Master Java Enterprise Architect (OCMJEA)

Large-scale development and service firms have different critical applications and systems to develop, manage, and maintain. Such systems require full-stack developers and specialized professionals with proven skills. Such organizations and MNCs hire only highly experienced professionals and specialists who can supervise extensive operations, address defects, and define and develop systems as per requirements. The Oracle Certified Master Java Enterprise Architect (OCMJEA) is one of the most prestigious Java certifications a Java developer can achieve.

It covers topics such as: Architect Enterprise Applications through Java EE; Developing Applications for the Java EE 6; Developing Applications for the Java EE 7; Developing Applications with Java EE 6 on WebLogic Server 12c; Java Design Patterns; Java EE 6: Develop Business Components with JMS & EJBs; Java EE 6: Develop Database Applications with JPA; Java EE 6: Develop Web Services with JAX-WS & JAX-RS; Java EE 7: New Features; Java SE 7: Develop Rich Client Applications; Java SE 8: Programming; Java SE 8 Fundamentals; Object-Oriented Analysis and Design Using UML, etc.

Demand and Benefits: Once you pass the Certified Master Java Enterprise Architect course, you have the essential skills and understanding needed to execute application development at an enterprise level. Such an experienced professional gains full-stack Java development skills and is hired with the responsibility of undertaking Java projects from the very start to their final delivery. Many Certified Master Java Enterprise Architects work as managers or in senior managerial roles in industries and firms. The average salary of a certified professional application developer is $ 14,000 - $ 19,210 per annum.
Top companies and industries hiring Oracle Certified Master Java Enterprise Architects are IBM, Oracle, Microsoft, HCL, Capgemini, NetSuite Inc., SAP, Cognizant, Atlassian, etc.

Where to take Training for Certification: KnowledgeHut has a fascinating Java course with hands-on learning exposure and a live project. Apart from that, professionals can train through ILT (Instructor-Led-in-Class), Learning Subscription, TOD (Training on Demand), LVC (Live Virtual Class), or classes delivered by an Oracle Authorized Education Center. An Oracle Authorized Partner, Oracle Academy, an Oracle University Training Center, or the Oracle Workforce Development Program can also train you for this course.

Who should take the Training (roles) for Certification: Any Java developer or full-stack application developer who wants to move to a senior enterprise-level role or become a manager or team lead can opt for this certification course. As a prerequisite, you need to have passed the OCPJP certification.

Course fees for Certification: $ 248

Application fee for certification: $ 248

Exam fee for certification: $ 248

Retake fee for certification: Aspirants can retake the exam if the exam voucher has a free retake option. If the exam retake option is available, one can opt for the exam after 14 days.

Java is an evergreen programming language and is here to stay, at least for the next couple of decades. A vast community of professionals and entry-level aspirants enjoy the benefits of this pure object-oriented, class-based, multi-paradigm, high-level programming language. Java Certification requires proper training.

KnowledgeHut has the required infrastructure and quality education faculty, both online and offline, to train aspirants for these Oracle Certifications. It offers well-structured, industry-oriented Java certification training, explicitly designed to serve candidates according to the latest industry needs.
Getting proper training from KnowledgeHut will help aspirants master core knowledge of Java plus equip themselves with the industry standards to manage large projects. 

Best Python Certifications of 2021

Programming is always at the core of computer science and Information Technology. Every year millions of programmers graduate with a degree and look for opportunities in the job market. The demand for programmers is growing rapidly, and this demand is not going away anytime soon.

Python is an interpreted, general-purpose, high-level programming language created by Guido van Rossum and first released in 1991; it is now managed by the Python Software Foundation. It has become one of the most popular and widely used programming languages in various disciplines. Today, companies use Python for GUI and CLI-based software development, web development (server-side), data science, machine learning, AI, robotics, drone systems, developing cyber-security tools, mathematics, system scripting, etc.

According to the TIOBE index at the time of writing, Python ranks second among all programming languages. KnowledgeHut has some fascinating advanced-level courses on Python, such as Machine Learning using Python and Artificial Intelligence using Python.

Once you gain expertise in writing Python programs, you can start learning advanced-level Python libraries and modules such as Pandas, SciPy, NumPy, Matplotlib, etc. There are different options one can explore after learning Python, such as data analysis, machine learning, cybersecurity, automation, and web scraping.

Top Python Certifications of 2021

Certified Entry-Level Python Programmer (PCEP)
Certified Associate in Python Programming (PCAP)
Introduction to Programming Using Python by Microsoft
Certified Professional in Python Programming 1 & 2 (PCPP 1 & 2)
Certified Expert in Python Programming (CEPP)

During the course of your Python certification training and exam preparation, you will develop different real-world projects and get familiar with case studies. There will also be hands-on lab experience in Python programming. In this article, you will get to know the top five Python certifications of 2021 that can give you the launchpad you need to embark on a successful career.

1. Certified Entry-Level Python Programmer (PCEP):

The PCEP is an entry-level Python certification. To enroll in this course, you need to have a basic understanding of how procedural programming works. Some knowledge of flowcharts and algorithm creation will also benefit you. Through this certification, an aspirant can gain a core and fundamental understanding of Python. This certification from the Python Institute will make you proficient in Python programming and help you become a Python certified professional. Aspirants and professionals can choose Python as a career path and climb the Python Institute’s certification ladder from associate to professional.

PCEP comprises topics such as: basic formatting and outputting methods; handling Boolean values; compilation vs. interpretation; constants, variables and variable naming conventions; defining user-defined functions; fundamentals of computer programming; inputting and converting data; logical vs. bitwise operations in Python; looping and control statements; lists; new data aggregates: tuples and dictionaries; the assignment operator; primary kinds of data and numerical operators; rules governing the building of expressions; working with multi-dimensional arrays; different slicing operations.

Demand and Benefits: Having a PCEP certification verifies that the programmer or the aspirant has knowledge of all the necessary and fundamental Python concepts. The course also covers the syntax and semantics of the different Python constructs and data types offered by the language. This course provides concise knowledge of general coding techniques using standard language infrastructure and basic programming skills using Python. The average entry-level salary of a Python programmer with this certification is $5,660 per annum.
Top companies and industries hiring PCEP-certified professionals are Philips, Cataleya Pvt. Ltd., Deloitte, Zynga, Mphasis, VMware, etc.

Where to take Training for Certification: Python Institute has all the study resources you need to prepare for this examination. Apart from that, you can join the Python course offered by KnowledgeHut that has 24 hours of instructor-led training covering core programming concepts like operators, control flow, functions, syntax & indentations.

Who should take the Training (roles) for Certification: Any programmer or computer science aspirant who wants to learn Python or start an internship or entry-level job as a Python programmer can opt for this certification course. There is no other prerequisite to appear for this exam.

Course fees for Certification: $295. Application fee for certification: $295. Exam fee for certification: $295. Retake fee for certification: If a candidate fails the exam, he/she can wait 15 days before being allowed to retake the exam for free. There is no limit on the number of times a candidate may retake the exam.

2. Certified Associate in Python Programming (PCAP):

PCAP is the second-level, or associate-level, certification exam for Python. This course and certification will give you the confidence to measure your skill and complete Python-based coding tasks. It also prepares you to take part in competitive coding sessions. The course covers the essential notions and concepts related to object-oriented programming. With this associate-level certification, you can stand out in the competitive job market.

PCAP comprises topics such as: basic formatting and outputting methods; Python basics; using Boolean values; compilation vs. interpretation; variables and variable naming conventions; defining and using functions; fundamentals of computer programming; fundamentals of OOP; how to use OOP in the Python programming language; generators and closures; inputting and converting of data; logical vs. bitwise operations; looping and control statements; file processing for Python developers; name scope issues; new data aggregates: tuples and dictionaries; primary kinds of data and numerical operators; Python modules; inheritance in Python; rules for creating expressions; working with multi-dimensional arrays; strings, lists, and other Python data structures; the assignment operator; the concept of exceptions and their implementation.

Demand and Benefits: Having a PCAP certification verifies that the programmer or the aspirant has grasped all the essential concepts of intermediate-level Python programming. The course also covers the fundamental concepts of different Python constructs and the fundamentals of OOP. This course provides concise knowledge of general coding techniques using standard language infrastructure and basic programming skills using Python. The approximate salary of a Python programmer with this certification is $7,000 to $11,262 per annum. Top companies and industries hiring PCAP-certified professionals are CareCentrix, Accenture, Deutsche Bank, Collabera, NetApp, Capgemini, Tech Mahindra, Myntra, etc.

Where to take Training for Certification: Python Institute (https://pythoninstitute.org/free-python-courses/) has all the study resources you need to prepare for this examination. You can also get comprehensive training by enrolling for the Python course offered by KnowledgeHut (https://www.knowledgehut.com/programming/python-programming-certification-training) that has 24 hours of instructor-led training covering core programming concepts like operators, control flow, functions, syntax & indentations.

Who should take the Training (roles) for Certification: Any programmer or computer science aspirant who wants to build a career in Python or pursue an associate-level job as a Python programmer or developer can opt for this certification course. There is no other prerequisite to appear for this exam.
Course fees for Certification: $295. Application fee for certification: $295. Exam fee for certification: $295. Retake fee for certification: If a candidate fails the exam, he/she can wait 15 days before being allowed to retake the exam for free. There is no limit to the number of times a candidate may retake an exam.

3. Introduction to Programming Using Python by Microsoft

This is another popular entry-level Python certification, offered by Microsoft (https://docs.microsoft.com/en-us/learn/certifications/exams/98-381). This certification covers the syntax, data types, and basic understanding of Python. It also teaches how to logically solve a problem using Python constructs. Candidates wanting to enroll for this course are expected to have had approximately 100 hours of instruction or hands-on experience with the Python programming language, including debugging skills, logic development, understanding conditional and decision-making statements, and maintaining well-formed, well-documented Python code.

Microsoft’s Introduction to Programming Using Python comprises topics such as: basics of Python; using Boolean values; fundamentals of computer programming; interpretations; variables and variable naming conventions; defining and using functions; indexing and slicing operations; type conversions; basic formatting and outputting; data types and operators; control flow with decisions and loops; constructing data structures; jump statements; performing input and output operations; documenting and structuring code; comments and white-spaces; performing operations using modules and tools.

Demand and Benefits: Having a Microsoft certification verifies that the Python programmer or the aspirant has grasped all the necessary and fundamental Python concepts. The course also covers the syntax and semantics of the different Python constructs and data types offered by the language. Anyone with this certification will have a better understanding of core Python and can stand out from the rest in competitive exams. The average entry-level salary of a Python programmer with this certification is $5,660 per annum. Top companies and industries hiring Python professionals with this credential are Cataleya Pvt. Ltd., Zynga, VMware, Mphasis, Deloitte, Capgemini, etc.

Where to take Training for Certification: Microsoft has a paid five-day instructor-led course to prepare for this examination. Apart from that, you can join the Python course offered by KnowledgeHut that has 24 hours of instructor-led training covering core programming concepts like operators, control flow, functions, syntax & indentations.

Who should take the Training (roles) for Certification: Any programmer or computer science aspirant who wants to learn Python or start an internship or entry-level job as a Python programmer can opt for this certification course. There is no other prerequisite to appear for this exam.

Course fees for Certification: $127. Application fee for certification: $127. Exam fee for certification: $127. Retake fee for certification: Exam retake is free. If the candidate fails to achieve a passing score on the first attempt, he/she must wait 24 hours before retaking the exam.

4. Certified Professional in Python Programming 1 & 2 (PCPP 1 & 2):

Once you have sound knowledge of the core concepts of Python, or have 3 to 5 years of experience in Python programming, you may prepare for professional Python certification.
The Certified Professional in Python Programming 1 certification reflects your experience and programming skills in the following areas: text file handling; GUI-based programming; encapsulation; inheritance; advanced object-oriented programming; PEP conventions; metaprogramming; communicating with a program's environment; using libraries and modules; importing math, science, and engineering modules. Having this globally recognized credential will make you stand out in a competitive job market. Many recruiting agencies and firms are looking for professional Python programmers who can develop and deploy applications.

Certified Professional in Python Programming 2 (PCPP2) is another advanced-level professional certification, offering proficiency in Python-MySQL database handling. The Certified Professional in Python Programming 2 certification reflects your experience and programming skills in the following areas: basic directory structure; CRUD operations; design patterns; Observer and Proxy; Singleton and State Design; Template Method; Model-View-Controller using Python; multiprocessing, threading, subprocess, and multiprocessor synchronization; relational database management using Python; MySQL and SQL commands; sharing, storing and installing packages; network programming in Python; application testing techniques and principles.

Demand and Benefits: Having a PCPP certification verifies that the Python developer has all the necessary and essential skills of a professional Python programmer. The course covers advanced object-oriented programming concepts, GUI programming, etc. It provides focused knowledge that helps experienced professionals stand out in the software development industry. The approximate salary of a Python programmer with this certification is $12,053 to $14,700 per annum. Top companies and industries hiring PCPP certified professionals are Dell, Accenture, SG Analytics, HCL, Oracle, Capgemini, Tech Mahindra, Flipkart, etc.

Where to take Training for Certification: Python Institute has all the study resources you need to prepare for this examination. Apart from that, you can join the Python course offered by KnowledgeHut that has 32 hours of instructor-led training covering advanced programming concepts like database handling, OOP, logical layout, data visualization, etc.

Who should take the Training (roles) for Certification: Any professional, programmer, or experienced Python developer who wants to settle as a senior Python developer or pursue an experienced-level job as a Python programmer or developer can opt for this certification course. The candidate should have the Certified Associate in Python Programming (PCAP) certification or a few years of work experience in Python.

Course fees for Certification: $195. Application fee for certification: $195. Exam fee for certification: $195. Retake fee for certification: If a candidate fails the exam, he/she can wait 15 days before being allowed to retake the exam for free. There is no limit to the number of times a candidate may retake an exam.

5. Certified Expert in Python Programming (CEPP):

This Python certification is for experts who complete all the exams in the OpenEDG Python Institute's programming certification program (the PCAP-31-xx, PCPP-32-1-xx, and PCPP-32-2-xx exams). It is the most advanced credential a Python developer can achieve from the Python Institute. Having this globally recognized credential will verify your expertise in Python programming. It highlights expertise in the universal concepts of Python programming and showcases skill in resolving typical implementation challenges across different verticals of Python.

Demand and Benefits: Having a CEPP certification verifies that the Python developer has industry-level expertise in Python. This certification indicates that the candidate has covered all the topics from the basics to advanced object-oriented programming concepts, GUI programming, etc.
Using this certification, one can apply for a senior software development role, a Python developer’s role, team lead, agile project management lead, and other senior job roles. Many professionals switch their careers to Big Data, data analytics, machine learning, and deep learning after completing this certification. The approximate salary of a Python programmer with this certification is $17,350 to $39,945 per annum. Top companies and industries hiring CEPPs are Amazon, Tesla, HSBC, Google, HCL, Oracle, Capgemini, Qualcomm, 6sense, Vitrana, and other top service-based companies.

Where to take Training for Certification: Python Institute has all the study resources you need to prepare for the PCAP-31-xx, PCPP-32-1-xx, and PCPP-32-2-xx examinations. Once a candidate has passed all of these certifications, he/she is recognized as an OpenEDG Python Institute Certified Expert in Python Programming (CEPP).

Who should take the Training (roles) for Certification: Any professional, Python expert, or senior Python developer who wants to settle as a team lead or pursue an experienced-level job profile can opt for these certifications to reach this level.

Course fees for Certification: $295 + $295 + $195. Application fee for certification: $295 + $295 + $195. Exam fee for certification: $295 + $295 + $195. Retake fee for certification: There is no retake fee.

Conclusion

We trust this article gave you a better insight into the different Python certifications! Whether you are starting out as a coder or are an experienced Python programmer looking to make a splash in the industry, having a Python certification and proper knowledge of Python will elevate your programming career. Python is one of the top programming languages and can help you land jobs in web development, app development, data science, cybersecurity, networking, web scraping, robotics, IoT, etc.

If you aren't sure which online resource will be most informative for your Python certification, KnowledgeHut (https://www.knowledgehut.com/) has the study materials and expert trainers to help you reach the pinnacle of Python expertise. Receiving a Python certification, apart from academics and degrees, will make you stand out from the rest. So, start preparing for one today!