What is PyPI & How To Publish An Open-Source Python Package to PyPI

Published
27th Sep, 2023

    The Python standard library offers sophisticated and robust capabilities out of the box. You will find modules for working with sockets and for working with files and file paths.

    However great the packages that ship with Python are, there are many more exciting projects outside the standard library. Most of them are hosted on the Python Package Index (PyPI), a repository of software for the Python programming language.

    PyPI is an important part of what makes Python a powerful language. It gives you access to thousands of libraries, from Hello World examples to advanced deep learning libraries.

    What is PyPI

    "PyPI" should be pronounced like "pie pea eye", with the "PI" pronounced as individual letters rather than as a single sound. This minimizes confusion with the PyPy project, which is a popular alternative implementation of the Python language.

    The Python Package Index, abbreviated as PyPI, is also known as the Cheese Shop. It is the official third-party software repository for Python, just as CPAN is the repository for Perl. Package managers such as pip use PyPI as the default source for packages and their dependencies. Hundreds of thousands of Python packages can be accessed through PyPI.

    How to use PyPI

    To install packages from PyPI you need a package installer. The recommended installer for PyPI is pip, which is included when you install a recent version of Python on your system. To learn more about pip, you may go through our article on “What is pip”. The pip command is a tool for installing and managing Python packages, such as those found in the Python Package Index. It is a replacement for easy_install.

    To install a package from the Python Package Index, open your terminal and run pip install followed by the name of the package. The most common uses of pip are to install, upgrade, or uninstall a package.
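
    As a small companion to the pip usage described above, the sketch below checks from inside Python whether a distribution is installed and which version is present. It relies on importlib.metadata from the standard library (Python 3.8 or later). The helper name installed_version is an assumption made for this illustration and is not part of pip.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pip" is used only as an example of a distribution that is often present.
    print(installed_version("pip"))
```

    The function returns None instead of raising an error, which makes it convenient for a quick check before or after running pip install.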

    Starting with a Small Python Package

    We will start with a small Python package that we will use as an example to publish to PyPI. You can get the full source code from the GitHub repository. The package is called reader and it is an application by which you can download and read articles. 

    The directory structure of reader is shown below:

    reader/ 
    │ 
    ├── reader/ 
    │   ├── config.txt 
    │   ├── feed.py 
    │   ├── __init__.py 
    │   ├── __main__.py 
    │   └── viewer.py 
    │ 
    ├── tests/ 
    │   ├── test_feed.py 
    │   └── test_viewer.py 
    │ 
    ├── MANIFEST.in 
    ├── README.md 
    └── setup.py 

    The source code of the package is in a reader subdirectory, together with a configuration file. The GitHub repository also contains a few tests in a separate subdirectory.

    In the coming sections, we will discuss the working of the reader package and also take a look at the special files which include setup.py, README.md, MANIFEST.in, and others. 

    Using the Article Reader

    The reader is a basic web feed reader. A web feed is a data format used for providing users with the latest updated content. With reader, you can download the latest articles from the article feed.

    You can get the list of articles using the reader:

    $ python -m reader
    The latest tutorials from Real Python (https://realpython.com/)
      0 How to Publish an Open-Source Python Package to PyPI
      1 Python "while" Loops (Indefinite Iteration)
      2 Writing Comments in Python (Guide)
      3 Setting Up Python for Machine Learning on Windows
      4 Python Community Interview With Michael Kennedy
      5 Practical Text Classification With Python and Keras
      6 Getting Started With Testing in Python
      7 Python, Boto3, and AWS S3: Demystified
      8 Python's range() Function (Guide)
      9 Python Community Interview With Mike Grouchy
     10 How to Round Numbers in Python
     11 Building and Documenting Python REST APIs With Flask and Connexion – Part 2
     12 Splitting, Concatenating, and Joining Strings in Python
     13 Image Segmentation Using Color Spaces in OpenCV + Python
     14 Python Community Interview With Mahdi Yusuf
     15 Absolute vs Relative Imports in Python
     16 Top 10 Must-Watch PyCon Talks
     17 Logging in Python
     18 The Best Python Books
     19 Conditional Statements in Python

    The articles in the list are numbered. So if you want to read a particular article, you can just write the same command along with the number of the article you desire to read.

    For reading the article on “How to Publish an Open-Source Python Package to PyPI”, just add the serial number of the article:

    $ python -m reader 0 
    # How to Publish an Open-Source Python Package to PyPI 
    
    Python is famous for coming with batteries included. Sophisticated 
    capabilities are available in the standard library. You can find modules 
    for working with sockets, parsing CSV, JSON, and XML files, and 
    working with files and file paths.
    
    However great the packages included with Python are, there are many 
    fantastic projects available outside the standard library. These are 
    most often hosted at the Python Packaging Index (PyPI), historically 
    known as the Cheese Shop. At PyPI, you can find everything from Hello 
    World to advanced deep learning libraries. 
    ... 
    ... 
    ...

    You can read any of the articles in the list just by changing the article number in the command.

    Quick Look

    The package comprises five files that do the actual work of the reader. Let us look at the implementations one by one:

    • config.txt - It is a text configuration file that specifies the URL of the feed of articles. The configparser module in the standard library is able to read this file. This type of file contains key-value pairs that are grouped into different sections.
    # config.txt
    [feed]
    url=https://realpython.com/atom.xml
    • __main__.py - It is the entry point of your program, whose duty is to control the main flow of the program. The double underscores indicate that this file has a special meaning: when the package is run with python -m reader, Python executes the contents of the __main__.py file.
    # __main__.py
    from configparser import ConfigParser
    from importlib import resources
    import sys

    from reader import feed
    from reader import viewer

    def main():
        # Read URL of the Real Python feed from config file
        configure = ConfigParser()
        configure.read_string(resources.read_text("reader", "config.txt"))
        url = configure.get("feed", "url")

        # If an article ID is given, show the article
        if len(sys.argv) > 1:
            article = feed.get_article(url, sys.argv[1])
            viewer.show(article)

        # If no ID is given, show a list of all articles
        else:
            site = feed.get_site(url)
            titles = feed.get_titles(url)
            viewer.show_list(site, titles)

    if __name__ == "__main__":
        main()
    • __init__.py - It is also considered a special file because of the double underscores. It denotes the root of your package, in which you can keep your package constants, your documentation, and so on.
    # __init__.py

    # Version of the realpython-reader package
    __version__ = "1.0.0"

    __version__ is a conventional variable used for adding version numbers to your package, as described in PEP 396. Variables defined in __init__.py are also available as attributes in the package namespace:

    >>> import reader
    >>> reader.__version__
    '1.0.0'
    • feed.py - In __main__.py, you can see that two modules, feed and viewer, are imported. They perform the actual work. The file feed.py is used to read from a web feed and parse the result.
    # feed.py

    import feedparser
    import html2text

    _CACHED_FEEDS = dict()

    def _feed(url):
        """Only read a feed once, by caching its contents"""
        if url not in _CACHED_FEEDS:
            _CACHED_FEEDS[url] = feedparser.parse(url)
        return _CACHED_FEEDS[url]
    • viewer.py - This module contains two functions, show() and show_list().
    # viewer.py

    def show(article):
        """Show one article"""
        print(article)

    def show_list(site, titles):
        """Show list of articles"""
        print(f"The latest tutorials from {site}")
        for article_id, title in enumerate(titles):
            print(f"{article_id:>3} {title}")

    The function of show() is to print one article to the console. On the other hand, show_list prints a list of titles.
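
    To make the role of config.txt described above more concrete, here is a minimal sketch of how configparser turns the sectioned key-value text into a value you can use. The constant CONFIG_TEXT and the helper read_feed_url are names invented for this example. The real package reads the file through importlib.resources rather than from a string literal.

```python
from configparser import ConfigParser

# Contents of config.txt as shown in the article.
CONFIG_TEXT = """
[feed]
url=https://realpython.com/atom.xml
"""


def read_feed_url(config_text):
    """Parse INI-style text and return the url key of the [feed] section."""
    parser = ConfigParser()
    parser.read_string(config_text)
    return parser.get("feed", "url")


if __name__ == "__main__":
    print(read_feed_url(CONFIG_TEXT))
```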

    Calling a Package 

    When your package consists of several source code files, you need to know which file to call to run the reader. The Python interpreter has an -m option that lets you specify a module name instead of a file name.

    For example, if you have a script called hello.py, the following two commands both run it:

    $ python hello.py
    Hi there!
    
    $ python -m hello
    Hi there!

    The two commands above are equivalent. However, the latter one with -m has an advantage: you can also use it to call modules that ship with Python:

    $ python -m antigravity
    Created new window in existing browser session.

    The -m option also allows you to work with packages and modules:

    $ python -m reader
    ...

    Since reader is a package, the name only refers to a directory. Python looks for a file named __main__.py inside it. If the file is found, it is executed. Otherwise, an error message is printed:

    $ python -m math
    python: No code object available for math
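
    The behaviour of -m with a package can be tried out without touching the reader project. The sketch below creates a throwaway package containing a __main__.py in a temporary directory and runs it with python -m. The package name demo_pkg and the helper run_package_with_m are assumptions made purely for this illustration.

```python
import pathlib
import subprocess
import sys
import tempfile


def run_package_with_m(package_name="demo_pkg"):
    """Create a throwaway package with a __main__.py and run it via -m."""
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = pathlib.Path(tmp) / package_name
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("")
        (package_dir / "__main__.py").write_text('print("Hi there!")\n')

        # Running with cwd=tmp puts the temporary directory on the module search path.
        result = subprocess.run(
            [sys.executable, "-m", package_name],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_package_with_m())
```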

    Preparing Your Package

    Now that you have a package, let us go through the steps that need to be done before you upload it.

    Naming the Package 

    Finding a good and unique name for your package is the first and one of the most difficult tasks. PyPI already lists hundreds of thousands of packages, so chances are that your favorite name is already taken.

    You need to perform some research work in order to find a perfect name. You can also use the PyPI search to verify whether it is already used or not.  

    We will use a more descriptive name, realpython-reader, so that the reader package can be found easily on PyPI. This is also the name used to install the package with pip:

    $ pip install realpython-reader

    Note that although the name on PyPI is realpython-reader, the package is still called reader when you import it:

    >>> import reader
    >>> help(reader)
    
    >>> from reader import feed
    >>> feed.get_titles()
    ['How to Publish an Open-Source Python Package to PyPI', ...]

    You can use different names for your package on PyPI and when importing it, but it is suggested to use the same name or similar ones for better understanding.
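
    When checking whether a name is free, it helps to know that PyPI compares project names in a normalized form, so realpython_reader and RealPython.Reader refer to the same project as realpython-reader. The sketch below applies the normalization rule given in PEP 503. The function name normalize_name is chosen for this example.

```python
import re


def normalize_name(name):
    """Normalize a project name following the rule given in PEP 503."""
    # Runs of "-", "_" and "." collapse into a single "-", and case is ignored.
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for candidate in ("realpython-reader", "realpython_reader", "RealPython.Reader"):
        print(candidate, "->", normalize_name(candidate))
```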

    Configuring your Package

    Your package needs some basic information about itself, which in this article is provided in a setup.py file. setup.py is the long-established way of providing this information with setuptools, although newer projects often declare the same metadata in a pyproject.toml file instead.

    The setup.py file should be placed in the top folder of your package. An example of a setup.py for reader:

    import pathlib
    from setuptools import setup

    # The directory containing this file
    HERE = pathlib.Path(__file__).parent

    # The text of the README file
    README = (HERE / "README.md").read_text()

    # This call to setup() does all the work
    setup(
        name="realpython-reader",
        version="1.0.0",
        description="The latest Python tutorials",
        long_description=README,
        long_description_content_type="text/markdown",
        url="https://github.com/realpython/reader",
        author="Real Python",
        author_email="office@realpython.com",
        license="MIT",
        classifiers=[
            "License :: OSI Approved :: MIT License",
            "Programming Language :: Python :: 3",
            "Programming Language :: Python :: 3.7",
        ],
        packages=["reader"],
        include_package_data=True,
        install_requires=["feedparser", "html2text"],
        entry_points={
            "console_scripts": [
                "realpython=reader.__main__:main",
            ]
        },
    )

    The parameters that are strictly necessary in the call to setup() are as follows:

    • name - the name of your package as it will appear on PyPI
    • version - the present version of your package
    • packages - the packages and subpackages which contain your source code

    If your package has subpackages, you will have to specify them as well. setuptools provides find_packages(), whose job is to discover all your subpackages. You can also use it in the reader project:

    from setuptools import find_packages, setup
    setup( 
        ...
        packages=find_packages(exclude=("tests",)),
        ...
    ) 

    You can also add more information along with name, version, and packages which will make it easier to find on PyPI.

    Two more important parameters of setup():

    • install_requires - It lists the third-party libraries that your package depends on. feedparser and html2text are listed since they are the dependencies of reader.
    • entry_points - It creates scripts that call a function within your package. Our script realpython calls main() within the reader/__main__.py file.
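
    To illustrate what an entry point string such as realpython=reader.__main__:main expresses, the sketch below splits a string of that shape into a script name and a callable. This is only a simplified illustration: in practice the installer generates the console script for you, and resolve_entry_point is a hypothetical helper. The target json:dumps is a stand-in, because reader may not be installed on your system.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name=module:function' and return (name, callable)."""
    name, _, target = spec.partition("=")
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name.strip())
    return name.strip(), getattr(module, attribute.strip())


if __name__ == "__main__":
    # json:dumps stands in for reader.__main__:main in this demonstration.
    script_name, function = resolve_entry_point("to-json=json:dumps")
    print(script_name, function([1, 2]))
```

    The part before the equals sign becomes the command name, the part before the colon is the module to import, and the part after the colon is the function that gets called.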

    Documenting Your Package

    Documenting your package before releasing it is an important step. Documentation can range from a simple README file to a complete website with tutorials, example galleries, and an API reference.

    At a minimum, include a README file with your project. It should give a quick description of your package and explain how to install and use it. You will usually pass your README as the long_description argument to setup(), which means it will be displayed on PyPI.

    PyPI can render package documentation written in reStructuredText or Markdown. Use the setup() parameter long_description_content_type to tell PyPI which format you are using.

    When you are working on bigger projects and want to add more documentation to your package, you can take the help of websites like GitHub and Read the Docs.

    Versioning Your Package 

    As with documentation, you need to add a version to your package. PyPI only lets you upload a particular version of a package once, which promotes reproducibility: two systems with the same version of a package should behave in the same way.

    PEP 440 describes a number of schemes for software versioning. However, for a simple project, let us stick to a simple versioning scheme.

    A simple versioning technique is semantic versioning, which has three components, namely MAJOR, MINOR, and PATCH, and some simple rules about when to increment each component:

    • Increment the MAJOR version when you make incompatible API changes. 
    • Increment the MINOR version when you add functionality in a backward-compatible manner. 
    • Increment the PATCH version when you make backward-compatible bug fixes. (Source: semver.org)
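
    The three rules above can be sketched as a small function. This is a minimal illustration that assumes plain MAJOR.MINOR.PATCH strings without pre-release or build suffixes. The function name bump is chosen for this example and is unrelated to any particular tool.

```python
def bump(version, part):
    """Return version with the given part ('major', 'minor', 'patch') incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Incompatible API change: lower components reset to zero.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible feature: patch resets to zero.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward-compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    print(bump("1.0.0", "patch"))
    print(bump("1.0.1", "minor"))
    print(bump("1.1.0", "major"))
```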

    You may need to specify the version in several files inside your project. Here it appears in both setup.py and reader/__init__.py. To keep the version numbers consistent, you can use a tool called Bumpversion:

    $ pip install bumpversion

    Adding Files To Your Package

    Your package might include files other than source code, such as data files, binaries, documentation, and configuration files. To add such files, we will use a manifest file. In most cases, setup() creates a manifest that includes all code files as well as README files.

    However, if you want to change the manifest, you can create a manifest template of your own. The file should be called MANIFEST.in and it will specify rules for what needs to be included and what needs to be excluded: 

    include reader/*.txt

    This will add all the .txt files in the reader directory. Besides creating the manifest, the non-code files also need to be copied. This can be done by setting include_package_data to True:

    setup(
        ...
        include_package_data=True,
        ...
    )

    Publishing to PyPI 

    To publish your package to the real world, first register an account on PyPI and also on TestPyPI. TestPyPI is useful because you can try out the publishing process without any consequences if something goes wrong.

    You will have to use a tool called Twine to upload your package to PyPI:

    $ pip install twine

    Building Your Package

    The packages on PyPI are wrapped into distribution packages, of which the most common are source archives and Python wheels. A source archive comprises your source code and any supporting files wrapped into one tar file. A Python wheel, on the other hand, is a zip archive that also contains your code. Unlike a source archive, a wheel includes any extensions already built and ready to use.

    Run the following command in order to create a source archive and a wheel for your package: 

    $ python setup.py sdist bdist_wheel

    The command above will create two files in a newly created directory called dist, a source archive and a wheel: 

    reader/
    │ 
    └── dist/ 
        ├── realpython_reader-1.0.0-py3-none-any.whl 
        └── realpython-reader-1.0.0.tar.gz 
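
    The wheel filename in the listing above follows a fixed pattern: name, version, Python tag, ABI tag, and platform tag, separated by hyphens. The tags py3-none-any say that the wheel is pure Python 3 code with no ABI requirement and works on any platform. The sketch below splits such a filename into its parts. It assumes a filename without the optional build tag, and parse_wheel_filename is a name invented for this example.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename without a build tag into its five components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected name-version-python-abi-platform")
    keys = ("name", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


if __name__ == "__main__":
    print(parse_wheel_filename("realpython_reader-1.0.0-py3-none-any.whl"))
```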

    Command-line arguments such as sdist and bdist_wheel are not defined in your setup.py. They are implemented upstream, in setuptools and the distutils code it builds on. Using the --help-commands option, you can list all the available commands:

    $ python setup.py --help-commands 
    Standard commands: 
      build             build everything needed to install 
      build_py          "build" pure Python modules (copy to build directory) 
      < ... many more commands ...>

    Testing Your Package 

    To test your package, check whether the distribution packages you have just created contain the expected files. On Linux and macOS, you can list the contents of the tar source archive as follows:

    $ tar tzf realpython-reader-1.0.0.tar.gz 
    realpython-reader-1.0.0/ 
    realpython-reader-1.0.0/setup.cfg 
    realpython-reader-1.0.0/README.md 
    realpython-reader-1.0.0/reader/ 
    realpython-reader-1.0.0/reader/feed.py 
    realpython-reader-1.0.0/reader/__init__.py 
    realpython-reader-1.0.0/reader/viewer.py 
    realpython-reader-1.0.0/reader/__main__.py 
    realpython-reader-1.0.0/reader/config.txt 
    realpython-reader-1.0.0/PKG-INFO 
    realpython-reader-1.0.0/setup.py 
    realpython-reader-1.0.0/MANIFEST.in 
    realpython-reader-1.0.0/realpython_reader.egg-info/ 
    realpython-reader-1.0.0/realpython_reader.egg-info/SOURCES.txt 
    realpython-reader-1.0.0/realpython_reader.egg-info/requires.txt 
    realpython-reader-1.0.0/realpython_reader.egg-info/dependency_links.txt 
    realpython-reader-1.0.0/realpython_reader.egg-info/PKG-INFO 
    realpython-reader-1.0.0/realpython_reader.egg-info/entry_points.txt 
    realpython-reader-1.0.0/realpython_reader.egg-info/top_level.txt 

    On Windows, you can make use of the utility tool 7-zip to look inside the corresponding zip file. 

    You should make sure that all the subpackages and supporting files are included in your package along with all the source code files as well as the newly built files. 
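
    If you prefer a check that works the same way on every platform, Python's own zipfile and tarfile modules can list the contents of both kinds of distribution files. The following is a minimal sketch. The helper name list_distribution_files is an assumption made for this example, and the paths in the usage section assume that you have already built the package as shown earlier.

```python
import pathlib
import tarfile
import zipfile


def list_distribution_files(path):
    """Return the member names of a .whl (zip) or .tar.gz distribution file."""
    path = str(path)
    if path.endswith(".whl") or path.endswith(".zip"):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()


if __name__ == "__main__":
    # Lists every distribution file found in dist/, if that directory exists.
    for dist_file in sorted(pathlib.Path("dist").glob("*")):
        print(dist_file.name)
        for member in list_distribution_files(dist_file):
            print("   ", member)
```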

    You can also run twine check on the files created in dist to check if your package description will render properly on PyPI: 

    $ twine check dist/*
    Checking distribution dist/realpython_reader-1.0.0-py3-none-any.whl: Passed 
    Checking distribution dist/realpython-reader-1.0.0.tar.gz: Passed 

    Uploading Your Package

    Now you have reached the final step, i.e. uploading your package to PyPI. Upload your package to TestPyPI first to check that it works as you expect. Use Twine and instruct it to upload your newly created distribution files:

    $ twine upload --repository-url https://test.pypi.org/legacy/ dist/* 

    After the uploading process is over, you can again go to TestPyPI and look at your project being displayed among the new releases.  

    When you are ready to publish your own package to the real PyPI, the command is short:

    $ twine upload dist/* 

    Provide your credentials when prompted (PyPI now expects an API token rather than your account password) and it is done. Your package has been published on PyPI. To look up your package, you can search for it, look at the Your projects page, or go directly to the URL of your project: pypi.org/project/your-package-name/.

    After completing the publishing process, you can install it on your system using pip:

    $ pip install your-package-name

    Miscellaneous Tools 

    There are some useful tools that are good to know when creating and publishing Python packages. Some of these are mentioned below. 

    Virtual Environments 

    Each virtual environment has its own Python binary and can also have its own set of installed Python packages in its directories. These packages are independent in nature. Virtual environments are useful in situations where there are a variety of requirements and dependencies while working with different projects. 

    You can grab more information about virtual environments in  the following references: 

    It is recommended to check your package inside a basic virtual environment to make sure that all necessary dependencies are included in your setup.py file.
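
    A virtual environment can also be created from Python itself with the standard library venv module. The sketch below creates one and returns the path of its pyvenv.cfg file, which marks the directory as a virtual environment. The helper name create_environment is an assumption made for this example. with_pip=False keeps the sketch quick and offline, whereas for actually testing a package you would want pip available inside the environment.

```python
import pathlib
import tempfile
import venv


def create_environment(target_dir):
    """Create a virtual environment without pip and return its config file path."""
    venv.create(target_dir, with_pip=False)
    return pathlib.Path(target_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_file = create_environment(pathlib.Path(tmp) / "env")
        print(config_file.exists())
```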

    Cookiecutter 

    Cookiecutter sets up your project from a template by asking you a few questions. Many different templates are available for Python projects.

    Install Cookiecutter using pip: 

    $ pip install cookiecutter

    To understand Cookiecutter, we will use a template called pypackage-minimal. To use a template, pass its link to cookiecutter:

    $ cookiecutter https://github.com/kragniz/cookiecutter-pypackage-minimal 
    author_name [Louis Taylor]: Real Python 
    author_email [louis@kragniz.eu]: office@realpython.com 
    package_name [cookiecutter_pypackage_minimal]: realpython-reader 
    package_version [0.1.0]: 
    package_description [...]: Read Real Python tutorials 
    package_url [...]: https://github.com/realpython/reader 
    readme_pypi_badge [True]: 
    readme_travis_badge [True]: False 
    readme_travis_url [...]: 

    Cookiecutter sets up your project after you have answered a series of questions. The template above will create the following files and directories:

    realpython-reader/ 
    │ 
    ├── realpython-reader/ 
    │   └── __init__.py 
    │ 
    ├── tests/ 
    │   ├── __init__.py 
    │   └── test_sample.py 
    │ 
    ├── README.rst 
    ├── setup.py 
    └── tox.ini 

    You can also take a look at the documentation of cookiecutter for all the available cookiecutters and how to create your own template. 

    Summary

    Let us sum up the necessary steps we have learned in this article so far to publish your own package - 

    • Finding a good and unique name for your package
    • Configuring your package using setup.py 
    • Building your package 
    • Publishing your package to PyPI 

    Moreover, you have also learned to use a few new tools that help in simplifying the process of publishing packages.  

    For more detailed and comprehensive information, you can consult the Python Packaging Authority (PyPA).

    Profile

    Priyankur Sarkar

    Data Science Enthusiast

    Priyankur Sarkar loves to play with data and get insightful results out of it, then turn those insights and results into business growth. He is an electronics engineer with versatile experience as an individual contributor and as a team lead, and has actively worked towards building Machine Learning capabilities for organizations.
