What is PyPI & How To Publish An Open-Source Python Package to PyPI

Last updated on
31st May, 2022
Published
22nd Oct, 2019

The Python Standard Library offers sophisticated and robust capabilities out of the box. You will find modules for working with sockets, files, and file paths, among many others.

However great the packages that ship with Python are, there are many more exciting projects outside the standard library. Most of them are hosted on the Python Package Index (PyPI), a repository of software for the Python programming language.

PyPI is one of the reasons Python is such a powerful language. It gives you access to thousands of libraries, ranging from Hello World to advanced deep learning libraries.

What is PyPI

"PyPI" should be pronounced like "pie pea eye", with the "PI" pronounced as individual letters rather than as a single sound. This minimizes confusion with the PyPy project, which is a popular alternative implementation of the Python language.

The Python Package Index, abbreviated as PyPI, is also known as the Cheese Shop. It is the official third-party software repository for Python, just as CPAN is the repository for Perl. Some package managers, such as pip, use PyPI as the default source for packages and their dependencies. More than 150,000 Python packages can be accessed through PyPI.

How to use PyPI

To install packages from PyPI you need a package installer. The recommended installer is pip, which is included when you install a recent version of Python on your system. To learn more about pip, you may go through our article on “What is pip”. The pip command is a tool for installing and managing Python packages, such as those found in the Python Package Index. It is a replacement for easy_install.

To install a package from the Python Package Index, open your terminal and run pip install followed by the name of the package. The most common uses of pip are to install, upgrade, or uninstall a package. If these terms feel overwhelming, a Python programming bootcamp would be the right place to start.
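
As a rough illustration of those three everyday operations, the sketch below builds the corresponding pip command lines in Python. The helper name pip_command is made up for this example. It uses `sys.executable -m pip` so that the pip belonging to the current interpreter would be called, and nothing is executed unless you pass the resulting list to subprocess.run() yourself.

```python
import shlex
import sys

# Hypothetical mapping for illustration: action name -> pip arguments.
_ACTIONS = {
    "install": ["install"],
    "upgrade": ["install", "--upgrade"],
    "uninstall": ["uninstall"],
}


def pip_command(action, package):
    """Return the argument list for a pip call, without running it."""
    if action not in _ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return [sys.executable, "-m", "pip", *_ACTIONS[action], package]


if __name__ == "__main__":
    # Print the three commands instead of executing them.
    for action in ("install", "upgrade", "uninstall"):
        print(shlex.join(pip_command(action, "realpython-reader")))
```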

Starting with a Small Python Package

We will start with a small Python package that we will use as an example to publish to PyPI. You can get the full source code from the GitHub repository. The package is called reader, and it is an application that lets you download and read articles.

The directory structure of reader is shown below:

reader/ 
│ 
├── reader/ 
│   ├── config.txt 
│   ├── feed.py 
│   ├── __init__.py 
│   ├── __main__.py 
│   └── viewer.py 
│ 
├── tests/ 
│   ├── test_feed.py 
│   └── test_viewer.py 
│ 
├── MANIFEST.in 
├── README.md 
└── setup.py 

The source code of the package is in a reader subdirectory, together with a configuration file. The GitHub repository also contains a few tests in a separate subdirectory.

In the coming sections, we will discuss how the reader package works and also take a look at the special files, which include setup.py, README.md, MANIFEST.in, and others.

Using the Article Reader

The reader is a basic web feed reader. A web feed is a simple data format used for providing users with frequently updated content. With reader, you can download the latest articles from the Real Python feed.

You can get the list of articles using the reader:

$ python -m reader
The latest tutorials from Real Python (https://realpython.com/)
  0 How to Publish an Open-Source Python Package to PyPI
  1 Python "while" Loops (Indefinite Iteration)
  2 Writing Comments in Python (Guide)
  3 Setting Up Python for Machine Learning on Windows
  4 Python Community Interview With Michael Kennedy
  5 Practical Text Classification With Python and Keras
  6 Getting Started With Testing in Python
  7 Python, Boto3, and AWS S3: Demystified
  8 Python's range() Function (Guide)
  9 Python Community Interview With Mike Grouchy
 10 How to Round Numbers in Python
 11 Building and Documenting Python REST APIs With Flask and Connexion – Part 2
 12 Splitting, Concatenating, and Joining Strings in Python
 13 Image Segmentation Using Color Spaces in OpenCV + Python
 14 Python Community Interview With Mahdi Yusuf
 15 Absolute vs Relative Imports in Python
 16 Top 10 Must-Watch PyCon Talks
 17 Logging in Python
 18 The Best Python Books
 19 Conditional Statements in Python

The articles in the list are numbered. If you want to read a particular article, run the same command followed by the number of that article.

For example, to read the article on “How to Publish an Open-Source Python Package to PyPI”, add its number from the list:

$ python -m reader 0 
# How to Publish an Open-Source Python Package to PyPI 

Python is famous for coming with batteries included. Sophisticated 
capabilities are available in the standard library. You can find modules 
for working with sockets, parsing CSV, JSON, and XML files, and 
working with files and file paths.

However great the packages included with Python are, there are many 
fantastic projects available outside the standard library. These are 
most often hosted at the Python Packaging Index (PyPI), historically 
known as the Cheese Shop. At PyPI, you can find everything from Hello 
World to advanced deep learning libraries. 
... 
... 
...

You can read any of the articles in the list just by changing the article number in the command.

Quick Look

The package consists of five files that do the actual work of the reader. Let us go through them one by one:

  • config.txt - It is a text configuration file that specifies the URL of the article feed. The configparser module in the standard library can read this kind of file, which contains key-value pairs grouped into sections.
# config.txt
[feed]
url=https://realpython.com/atom.xml
  • __main__.py - It is the entry point of your program, whose duty is to control the main flow of the program. The double underscores indicate that this file has a special meaning: when the package is run with python -m reader, Python executes the contents of __main__.py.
# __main__.py
from configparser import ConfigParser
from importlib import resources
import sys

from reader import feed
from reader import viewer

def main():
    # Read URL of the Real Python feed from config file
    cfg = ConfigParser()
    cfg.read_string(resources.read_text("reader", "config.txt"))
    url = cfg.get("feed", "url")

    # If an article ID is given, show the article
    if len(sys.argv) > 1:
        article = feed.get_article(url, sys.argv[1])
        viewer.show(article)

    # If no ID is given, show a list of all articles
    else:
        site = feed.get_site(url)
        titles = feed.get_titles(url)
        viewer.show_list(site, titles)

if __name__ == "__main__":
    main()
  • __init__.py - It is also a special file, as the double underscores indicate. It denotes the root of your package, and it is a good place to keep package constants, documentation, and so on.
# __init__.py

# Version of the realpython-reader package
__version__ = "1.0.0"

__version__ is a conventional variable name used for adding version numbers to your package; the convention is described in PEP 396. Variables defined in __init__.py are also available as variables in the package namespace.

>>> import reader
>>> reader.__version__
'1.0.0'
  • feed.py - In __main__.py, you can see that two modules, feed and viewer, are imported; they perform the actual work. The file feed.py is used to read from a web feed and parse the result.
# feed.py

import feedparser
import html2text

# Maps each feed URL to its parsed contents
_CACHED_FEEDS = dict()

def _feed(url):
    """Only read a feed once, by caching its contents"""
    if url not in _CACHED_FEEDS:
        _CACHED_FEEDS[url] = feedparser.parse(url)
    return _CACHED_FEEDS[url]
  • viewer.py - This module contains two functions, show() and show_list().
# viewer.py

def show(article):
    """Show one article"""
    print(article)

def show_list(site, titles):
    """Show list of articles"""
    print(f"The latest tutorials from {site}")
    for article_id, title in enumerate(titles):
        print(f"{article_id:>3} {title}")

The show() function prints one article to the console, while show_list() prints a numbered list of titles.

Calling a Package 

When a package consists of several source code files, as reader does, you need to know which file to call in order to run it. The Python interpreter has an -m option that lets you specify a module name instead of a file name.

For example, if you have a script called hello.py, you can run it in two ways:

$ python hello.py
Hi there!

$ python -m hello
Hi there!

The two commands above are equivalent. However, the latter one with -m has an advantage: you can also use it to call modules from the standard library:

$ python -m antigravity
Created new window in existing browser session.

The -m option also allows you to work with packages and modules:

$ python -m reader
...

Here, reader refers only to a directory. When you run a package with -m, Python looks for a file named __main__.py inside it. If the file is found, it is executed; otherwise, an error message is printed:

$ python -m math
python: No code object available for math
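
If you would like to try the __main__.py lookup without touching your own project, the following sketch creates a throwaway package in a temporary directory and runs it with -m. The package name hello_pkg and the helper function are placeholders chosen for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package_with_m(package_name="hello_pkg"):
    """Create a tiny package with a __main__.py and run it via `python -m`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / package_name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "__main__.py").write_text('print("Hi there!")\n')

        # Run from the temporary directory so Python can find the package.
        result = subprocess.run(
            [sys.executable, "-m", package_name],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_package_with_m())
```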

Preparing Your Package

Now that you have your package, let us go through the steps that are needed before you can upload it. This often requires programmers to learn advanced Python programming skills.

Naming the Package 

Finding a good and unique name for your package is the first step and one of the most difficult. PyPI already has more than 150,000 packages, so chances are that your favorite name is already taken.

You may need to do some research to find a suitable name. You can use the PyPI search to check whether a name is already in use.

We will use a more descriptive name and call the package realpython-reader, so that it can be easily found on PyPI. This is also the name you use to install the package with pip:

$ pip install realpython-reader

Note that although the name on PyPI is realpython-reader, the package is still called reader when you import it:

>>> import reader
>>> help(reader)

>>> from reader import feed
>>> feed.get_titles()
['How to Publish an Open-Source Python Package to PyPI', ...]

You can use different names for your package on PyPI and when importing it, but it is suggested to use the same name or similar ones to avoid confusion.

Configuring your Package

Your package needs to include some basic information about itself, usually in the form of a setup.py file. When this article was written, setup.py was the only fully supported way of providing this information, although the Python community has initiatives to simplify how it is collected.

The setup.py file should be placed in the top folder of your package. An example of a setup.py for reader:

import pathlib
from setuptools import setup

# The directory containing this file
HERE = pathlib.Path(__file__).parent

# The text of the README file
README = (HERE / "README.md").read_text()

# This call to setup() does all the work
setup(
    name="realpython-reader",
    version="1.0.0",
    description="The latest Python tutorials",
    long_description=README,
    long_description_content_type="text/markdown",
    url="https://github.com/realpython/reader",
    author="Real Python",
    author_email="office@realpython.com",
    license="MIT",
    classifiers=[
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.7",
    ],
    packages=["reader"],
    include_package_data=True,
    install_requires=["feedparser", "html2text"],
    entry_points={
        "console_scripts": [
            "realpython=reader.__main__:main",
        ]
    },
)

The essential parameters in the call to setup() are as follows:

  • name - the name of your package as it will appear on PyPI
  • version - the current version of your package
  • packages - the packages and subpackages that contain your source code

If your package has subpackages, you need to list them as well. setuptools provides find_packages(), which discovers all your subpackages for you. You could also use it in the reader project:

from setuptools import find_packages, setup
setup( 
    ...
    packages=find_packages(exclude=("tests",)),
    ...
) 

You can add more information besides name, version, and packages, which will make your package easier to find on PyPI.

Two more important parameters of setup():

  • install_requires - It lists the third-party libraries your package depends on. feedparser and html2text are listed since they are the dependencies of reader.
  • entry_points - It creates scripts that call a function within your package. Our script realpython calls main() in the reader/__main__.py file.
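
To make the console_scripts string less cryptic, the sketch below splits an entry such as realpython=reader.__main__:main into the script name, the module to import, and the function to call. The parse_entry_point() helper is written for this illustration only and handles just the simple script=module:function form; the real entry point syntax supports more than that.

```python
def parse_entry_point(spec):
    """Split 'script=package.module:function' into its three parts."""
    script, _, target = spec.partition("=")
    module, _, function = target.partition(":")
    if not (script and module and function):
        raise ValueError(f"malformed entry point: {spec!r}")
    return script.strip(), module.strip(), function.strip()


if __name__ == "__main__":
    script, module, function = parse_entry_point("realpython=reader.__main__:main")
    print(f"Running `{script}` imports {module} and calls {function}()")
```

In other words, once the package is installed, typing realpython in a terminal has roughly the same effect as python -m reader.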

Documenting Your Package

Documenting your package before releasing it is an important step. The documentation can be a simple README file or a complete website with tutorials, example galleries, and an API reference.

At a minimum, you should include a README file with your project. It should give a quick description of your package and explain how to install and use it. You will typically pass your README as the long_description argument to setup(), so that it is displayed on PyPI.

PyPI supports Markdown as well as reStructuredText for package documentation. Use the setup() parameter long_description_content_type to tell PyPI which format you are using.

When you are working with bigger projects and want to add more documentation to your package, you can take the help of websites like GitHub and Read the Docs.

Versioning Your Package 

Just like documentation, your package needs a version. PyPI supports reproducibility by allowing only one upload of a particular version of a package. As a result, two systems that install the same version of a package get the same code, and the package will behave in the same way on both.

PEP 440 describes a number of schemes for software versioning. However, for a simple project, let us stick to a simple versioning scheme.

A simple technique is semantic versioning, which has three components, namely MAJOR, MINOR, and PATCH, and some simple rules about when to increment each component:

  • Increment the MAJOR version when you make incompatible API changes. 
  • Increment the MINOR version when you add functionality in a backward-compatible manner. 
  • Increment the PATCH version when you make backward-compatible bug fixes. (Source)
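
The three rules can be sketched in a few lines of Python. The bump() function below is a simplified illustration written for this article: it assumes a plain MAJOR.MINOR.PATCH string and resets the lower components to zero when a higher one is incremented, as semantic versioning prescribes. It does not handle pre-release or build suffixes.

```python
def bump(version, part):
    """Return a new MAJOR.MINOR.PATCH string with the given part incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # incompatible API change
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # new backward-compatible functionality
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # backward-compatible bug fix
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    for part in ("patch", "minor", "major"):
        print(part, "->", bump("1.0.0", part))
```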

You may need to specify the version in several files inside your project. To keep the version numbers consistent, you can use a tool called Bumpversion:

$ pip install bumpversion

Adding Files To Your Package

Your package might include files other than source code, such as data files, binaries, documentation, and configuration files. To tell setup() to include such files, you use a manifest file. In most cases, setup() creates a manifest that includes all code files as well as README files.

However, if you want to change the manifest, you can create a manifest template of your own. The file should be called MANIFEST.in and it will specify rules for what needs to be included and what needs to be excluded: 

include reader/*.txt

This will add all the .txt files in the reader directory. Besides creating the manifest, you also need to make sure the non-code files are copied when the package is installed. This is done by setting include_package_data to True:

setup(
    ...
    include_package_data=True,
    ...
)
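
To see why the data file has to travel with the package, recall that __main__.py reads config.txt at runtime through importlib.resources. The sketch below reproduces that lookup with a throwaway package created in a temporary directory. The package name demo_pkg, the example.com URL, and the helper name are placeholders for this illustration, and importlib.resources.files() requires Python 3.9 or later.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_packaged_config(package_name="demo_pkg"):
    """Create a package with a data file and read that file back by package name."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / package_name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "config.txt").write_text("[feed]\nurl=https://example.com/atom.xml\n")

        sys.path.insert(0, tmp)  # make the throwaway package importable
        try:
            importlib.invalidate_caches()
            importlib.import_module(package_name)
            resource = importlib.resources.files(package_name).joinpath("config.txt")
            return resource.read_text()
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop(package_name, None)


if __name__ == "__main__":
    print(read_packaged_config())
```

If config.txt were missing from the installed package, the read_text() call would fail, which is the situation MANIFEST.in and include_package_data are meant to prevent.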

Publishing to PyPI 

To publish your package to the real world, you first need to register an account on PyPI and also on TestPyPI. TestPyPI is useful because it lets you try out the publishing process without any consequences if something goes wrong.

You will use a tool called Twine to upload your package to PyPI:

$ pip install twine

Building Your Package

Packages on PyPI are distributed as distribution packages, the most common of which are source archives and Python wheels. A source archive consists of your source code and the corresponding support files wrapped into one tar file. A Python wheel is a zip archive that also contains your code. However, unlike a source archive, a wheel includes any extensions ready to use.

Run the following command in order to create a source archive and a wheel for your package: 

$ python setup.py sdist bdist_wheel

The command above will create two files in a newly created directory called dist, a source archive and a wheel: 

reader/
│ 
└── dist/ 
    ├── realpython_reader-1.0.0-py3-none-any.whl 
    └── realpython-reader-1.0.0.tar.gz 

Command-line arguments such as sdist are implemented by the distutils library that setuptools builds on, while bdist_wheel is provided by the wheel package. Using the --help-commands option, you can list all the available arguments:

$ python setup.py --help-commands 
Standard commands: 
  build             build everything needed to install 
  build_py          "build" pure Python modules (copy to build directory) 
  < ... many more commands ...>

Testing Your Package 

To test your package, check that the newly built distribution packages contain the files you expect. On Linux and macOS, you can list the contents of the tar source archive as follows:

$ tar tzf realpython-reader-1.0.0.tar.gz 
realpython-reader-1.0.0/ 
realpython-reader-1.0.0/setup.cfg 
realpython-reader-1.0.0/README.md 
realpython-reader-1.0.0/reader/ 
realpython-reader-1.0.0/reader/feed.py 
realpython-reader-1.0.0/reader/__init__.py 
realpython-reader-1.0.0/reader/viewer.py 
realpython-reader-1.0.0/reader/__main__.py 
realpython-reader-1.0.0/reader/config.txt 
realpython-reader-1.0.0/PKG-INFO 
realpython-reader-1.0.0/setup.py 
realpython-reader-1.0.0/MANIFEST.in 
realpython-reader-1.0.0/realpython_reader.egg-info/ 
realpython-reader-1.0.0/realpython_reader.egg-info/SOURCES.txt 
realpython-reader-1.0.0/realpython_reader.egg-info/requires.txt 
realpython-reader-1.0.0/realpython_reader.egg-info/dependency_links.txt 
realpython-reader-1.0.0/realpython_reader.egg-info/PKG-INFO 
realpython-reader-1.0.0/realpython_reader.egg-info/entry_points.txt 
realpython-reader-1.0.0/realpython_reader.egg-info/top_level.txt 

On Windows, you can make use of the utility tool 7-zip to look inside the corresponding zip file. 
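
As a platform-independent alternative, the standard library can list the archive contents as well. The sketch below uses tarfile for a source archive. So that it runs anywhere, it first builds a tiny stand-in archive; the demo-1.0.0 names and both helper functions are placeholders for this illustration, and in practice you would pass the path of the file in your own dist directory to list_sdist().

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_sdist(path):
    """Return the member names of a .tar.gz source archive."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()


def make_demo_sdist(directory):
    """Build a tiny stand-in archive so the example is self-contained."""
    path = Path(directory) / "demo-1.0.0.tar.gz"
    with tarfile.open(path, "w:gz") as archive:
        for name in ("demo-1.0.0/setup.py", "demo-1.0.0/demo/__init__.py"):
            data = b"# placeholder\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in list_sdist(make_demo_sdist(tmp)):
            print(name)
```

Because a wheel is a zip archive, zipfile.ZipFile(path).namelist() gives you the equivalent listing for the .whl file.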

Make sure that all the source code files, subpackages, and supporting files are included in your package, along with a few newly built files.

You can also run twine check on the files created in dist to check if your package description will render properly on PyPI: 

$ twine check dist/*
Checking distribution dist/realpython_reader-1.0.0-py3-none-any.whl: Passed 
Checking distribution dist/realpython-reader-1.0.0.tar.gz: Passed 

Uploading Your Package

Now you have reached the final step: uploading your package to PyPI. It is a good idea to upload your package to TestPyPI first to check that everything works as expected. Use Twine and instruct it to upload the distribution packages you have built:

$ twine upload --repository-url https://test.pypi.org/legacy/ dist/* 

After the uploading process is over, you can again go to TestPyPI and look at your project being displayed among the new releases.  

When you are ready to publish your own package to the real PyPI, the command is shorter:

$ twine upload dist/* 

Enter your username and password when prompted, and it’s done: your package has been published on PyPI. To look up your package, you can search for it, look at the Your projects page, or go directly to the URL of your project: pypi.org/project/your-package-name/

After completing the publishing process, you can install the package on your system using pip:

$ pip install your-package-name

Miscellaneous Tools 

There are some useful tools that are good to know when creating and publishing Python packages. Some of these are mentioned below. 

Virtual Environments 

Each virtual environment has its own Python binary and its own independent set of installed Python packages in its directories. Virtual environments are useful when different projects have different requirements and dependencies.

It is recommended to test your package inside a fresh virtual environment to make sure that all necessary dependencies are listed in your setup.py file.
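
A fresh environment can be created from the command line with python -m venv, or from Python itself with the standard library venv module. The sketch below takes the second route. The directory name demo-env and the helper function are placeholders for this illustration, and with_pip=False is used only to keep the example fast and offline; for actually testing an installation you would normally want pip inside the environment.

```python
import tempfile
import venv
from pathlib import Path


def create_demo_env(parent):
    """Create a bare virtual environment and return its directory."""
    env_dir = Path(parent) / "demo-env"
    # with_pip=False skips installing pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_demo_env(tmp)
        # pyvenv.cfg records which base Python the environment was created from.
        print((env_dir / "pyvenv.cfg").read_text())
```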

Cookiecutter 

Cookiecutter sets up your project from a template by asking you a few questions. Many different templates are available for Python projects.

Install Cookiecutter using pip: 

$ pip install cookiecutter

To see how Cookiecutter works, we will use a template called pypackage-minimal. To use a template, pass its link to the cookiecutter command:

$ cookiecutter https://github.com/kragniz/cookiecutter-pypackage-minimal 
author_name [Louis Taylor]: Real Python 
author_email [louis@kragniz.eu]: office@realpython.com 
package_name [cookiecutter_pypackage_minimal]: realpython-reader 
package_version [0.1.0]: 
package_description [...]: Read Real Python tutorials 
package_url [...]: https://github.com/realpython/reader 
readme_pypi_badge [True]: 
readme_travis_badge [True]: False 
readme_travis_url [...]: 

Cookiecutter sets up your project after you have answered the series of questions. The template above will create the following files and directories:

realpython-reader/ 
│ 
├── realpython-reader/ 
│   └── __init__.py 
│ 
├── tests/ 
│   ├── __init__.py 
│   └── test_sample.py 
│ 
├── README.rst 
├── setup.py 
└── tox.ini 

You can also take a look at the documentation of cookiecutter for all the available cookiecutters and how to create your own template. 

Summary 

Let us sum up the steps we have covered in this article for publishing your own package:

  • Finding a good and unique name for your package
  • Configuring your package using setup.py 
  • Building your package 
  • Publishing your package to PyPI 

Moreover, you have also learned to use a few new tools that help in simplifying the process of publishing packages.  

You can reach out to the Python Packaging Authority for more detailed and comprehensive information. To gain more knowledge about Python tips and tricks, check our Python tutorial and get a good hold over coding in Python by joining the KnowledgeHut Python programming bootcamp.

Profile

Priyankur Sarkar

Data Science Enthusiast

Priyankur Sarkar loves to play with data and get insightful results out of it, then turn those insights and results into business growth. He is an electronics engineer with versatile experience as an individual contributor and team leader, and has actively worked towards building machine learning capabilities for organizations.