Pipenv: Python Dev Workflow for Humans
I took the title of this article directly from the Pipenv documentation because it is amazingly accurate. If you have used package managers for other languages, such as npm or CocoaPods, and then came to Python, you might have noticed that pip is not as sophisticated. That can be a good thing, since it lowers the barrier to entry, but it also encourages filling our requirements.txt file with the dependencies of other libraries rather than our own. Pipenv helps us manage our dependencies, makes working with virtual environments more seamless, and ensures that we are always using the correct packages when developing our code. Want to know more? Read on!
Example App
Let's say we want to make an app that uploads a file to an S3 bucket. I picked this example because Amazon's Boto3 library has quite a few dependencies of its own, which makes it a prime example of how a requirements file can be optimized. The app itself is straightforward, and I didn't want to pick a lot of arbitrary dependencies just to show you how full your requirements file can get. I will be doing everything in a virtual environment so we can get an accurate picture of our project dependencies. This example code was taken from https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html.
Everything you are about to see was executed inside this repository. You can clone it and follow along; I highly recommend you see it for yourself. To make your life easier, also use pyenv (read about it here) and Python 3.8.2, which was used to create the examples for this article.
Basic Pip usage
Let's see how we would use Pip to build this app. First, we would create a virtual environment and activate it:
cd pip-example
python -m venv venv
. venv/bin/activate
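If you are ever unsure whether the activation worked, you can ask the interpreter itself. This is a minimal sketch, assuming the environment was created with venv as above; the helper name in_virtualenv is my own and not part of any library:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("environment prefix:", sys.prefix)
```

Run it with the python from the activated environment; outside of one, both prefixes are the same and the check reports False.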
Let's now run "pip freeze" to make sure we don't have any packages installed:
pip freeze
Let's now install the boto3 library and run "pip freeze" again:
pip install boto3
pip freeze
boto3==1.14.28
botocore==1.17.28
docutils==0.15.2
jmespath==0.10.0
python-dateutil==2.8.1
s3transfer==0.3.3
six==1.15.0
urllib3==1.25.10
We are checking "pip freeze" because a lot of people, when working on a project, install packages first and add them to the requirements later, often by redirecting the output of "pip freeze" straight into requirements.txt. But this list is not something we want to save and version in our repository, as the only real dependency on it is boto3. Everything else was pulled in by boto3 or by one of its dependencies.
Right now it's quite easy to pick out our dependency and copy it to the requirements file, but what happens when you introduce a couple more libraries? As the list gets longer, your willingness to pick out the correct dependencies gets lower, and you start flooding your requirements with unnecessary pinned libraries, which sooner or later produces clashes among their dependencies.
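To make the idea of a "real" dependency concrete, here is a small sketch that separates packages you installed on purpose from the ones that only came along as sub-dependencies. The function name and the dependency map are my own: the map is written by hand to resemble the freeze output above and is an assumption for illustration, not something read from package metadata.

```python
from typing import Dict, List, Set


def top_level_packages(requires: Dict[str, List[str]]) -> List[str]:
    """Return packages that no other package in the mapping depends on."""
    # Collect every name that appears as somebody else's dependency.
    required_by_others: Set[str] = set()
    for package, dependencies in requires.items():
        for dependency in dependencies:
            if dependency != package:
                required_by_others.add(dependency)
    # Whatever is left was installed on purpose, not pulled in transitively.
    return sorted(name for name in requires if name not in required_by_others)


# Hand-written, illustrative dependency map for the freeze output above.
# The edges are an assumption made for this example.
FROZEN = {
    "boto3": ["botocore", "jmespath", "s3transfer"],
    "botocore": ["docutils", "jmespath", "python-dateutil", "urllib3"],
    "docutils": [],
    "jmespath": [],
    "python-dateutil": ["six"],
    "s3transfer": ["botocore"],
    "six": [],
    "urllib3": [],
}

print(top_level_packages(FROZEN))  # prints ['boto3']
```

Out of eight frozen lines, only one belongs in a hand-maintained requirements file. This is exactly the bookkeeping that gets skipped once the list grows.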
Let's now deactivate our virtual environment to check out Pipenv:
deactivate
Basic Pipenv usage
First, we need to install Pipenv, which is pretty easy on macOS with Homebrew installed:
brew install pipenv
Otherwise, follow the installation steps in the Pipenv documentation.
Let's now do the same thing, but with Pipenv:
cd pipenv-example
pipenv --three
pipenv shell
If you are using pyenv (you should be) and you have specified a local version of Python, you can just run pipenv shell and everything will be set up for you. You can read about pyenv in my previous article.
Once we are inside our virtual environment, we can install our dependency:
pipenv install boto3
You will see some output from that command, something similar to this:
Installing boto3…
Adding boto3 to Pipfile's [packages]…
✔ Installation Succeeded
Pipfile.lock not found, creating…
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
Building requirements...
Resolving dependencies...
✔ Success!
Updated Pipfile.lock (dfb424)!
Installing dependencies from Pipfile.lock (dfb424)…
🐍 ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 0/0 — 00:00:00
"That's a lot of new things. It mentions Pipfile and Pipfile.lock; what are these?" you might ask. Let's look at the contents of Pipfile first:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
[packages]
boto3 = "*"
[requires]
python_version = "3.8"
Pipfile is the basis of our dependency list. It holds the main dependencies that we have installed and nothing more. It only has boto3 in the packages section, and the version is specified as *, which means any version is acceptable, so Pipenv installs the latest one available (we should probably change that to a fixed version). To install the same version as in the pip example (1.14.28), we can simply run:
pipenv install boto3==1.14.28
And that will update our Pipfile to this:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
[packages]
boto3 = "==1.14.28"
[requires]
python_version = "3.8"
This file is the reason we want to use Pipenv. It provides a single source of truth for the dependencies of our project, and we can be certain that the packages it lists are the true dependencies.
You can look at the contents of Pipfile.lock here (the file is pure JSON, and it's too large and hard to read to include in this article). It lists all the dependencies that the boto3 library requires, along with their versions, hashes, and markers: all the goodies that package managers need.
Deactivating the virtual environment when using Pipenv is as simple as this:
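Because the lock file is plain JSON, it is easy to inspect from Python. The sketch below is a rough illustration: the sample is heavily trimmed (a real Pipfile.lock also carries hashes and more metadata), and the function name locked_versions is my own.

```python
import json
from typing import Dict


def locked_versions(lock_text: str, section: str = "default") -> Dict[str, str]:
    """Map each package in one section of a Pipfile.lock to its pinned version."""
    lock = json.loads(lock_text)
    # "default" holds the [packages] entries, "develop" holds [dev-packages].
    packages = lock.get(section, {})
    return {name: entry.get("version", "") for name, entry in packages.items()}


# Trimmed, hand-written sample; hashes and most metadata are left out.
SAMPLE_LOCK = """
{
  "_meta": {"requires": {"python_version": "3.8"}},
  "default": {
    "boto3": {"version": "==1.14.28"},
    "botocore": {"version": "==1.17.28"}
  },
  "develop": {}
}
"""

for name, version in locked_versions(SAMPLE_LOCK).items():
    print(name, version)
```

To try it on a real project, read the text with pathlib.Path("Pipfile.lock").read_text() and pass it in instead of the sample.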
exit
I must say I prefer activating and deactivating virtual environments in Pipenv over regular venvs, as it more closely tracks the behavior of regular shells.
"Pip freeze" problem
Consider this scenario (which happened to me not that long ago): you start working on a new project, clone the repo to your machine, create your virtual environment, run "pip install -r requirements.txt", and get an error message like this:
ERROR: zappa 0.51.0 has requirement python-dateutil<2.7.0,
but you'll have python-dateutil 2.8.1 which is incompatible.
This is caused by storing the output of "pip freeze" directly instead of building the file carefully, line by line, dependency by dependency. The result is just a pile of libraries, their dependencies, and dependencies of dependencies, each pinned to whatever version happened to be installed at the time.
How do you resolve that kind of mess? The easiest fix is to open the requirements file, remove the row for python-dateutil, and run the install again, at which point it finishes with no errors. But how can we know that we don't need version 2.8.1? After all, it was specified as a dependency, so it should stay at 2.8.1, right?
You can never be 100% certain unless you go over the entire codebase and check. Some piece of code somewhere might rely on that specific version, and it might cause a crash later on. In my case the codebase was relatively small, so I went over it, made sure python-dateutil was not used anywhere directly, removed the pin, and reinstalled the packages.
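The clash itself is easy to reproduce on paper. Here is a deliberately simplified sketch that checks a pinned version against a single constraint such as <2.7.0. It only understands purely numeric versions and one operator at a time, so treat it as an illustration of what pip is complaining about rather than a replacement for a real specifier parser; all names in it are my own.

```python
import operator
from typing import Tuple

# Comparison functions for the operators this sketch understands.
_OPERATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.8.1' into (2, 8, 1); only numeric, dot-separated versions work."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, constraint: str) -> bool:
    """Check one pinned version against one constraint like '<2.7.0'."""
    constraint = constraint.strip()
    # Try two-character operators first so "<=" is not read as "<".
    for symbol in ("<=", ">=", "==", "!=", "<", ">"):
        if constraint.startswith(symbol):
            wanted = parse_version(constraint[len(symbol):])
            return _OPERATORS[symbol](parse_version(version), wanted)
    raise ValueError(f"unsupported constraint: {constraint!r}")


# The pin from requirements.txt against the constraint from the error message.
print(satisfies("2.8.1", "<2.7.0"))  # prints False
```

The frozen pin and the library's own constraint cannot both hold, and nothing in requirements.txt tells you which of the two the project actually cares about.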
Pipenv does not have the "freeze issue"
Since Pipenv updates its Pipfile whenever a new package is installed, it does not have the problem described above. It keeps its list of dependencies clean and tidy. There can still be clashes between sub-dependencies, but you will be notified when you install the package that causes the clash, so you have a chance to correct it yourself instead of pushing it onto the next person who downloads and installs the dependencies for your code.
Migrating to Pipenv
If you want to start using Pipenv today, you don't have to do anything special, but you might encounter "pip freeze" issues on your projects. When you activate the virtual environment from Pipenv and it detects a requirements.txt file instead of a Pipfile, it converts the file for you. However, it does not know which entries are the actual dependencies, so it just adds all of them to Pipfile as packages. You can try it out yourself:
cd migrate-example
pipenv shell
---
✔ Successfully created virtual environment!
Virtualenv location: /Users/martin/.local/share/virtualenvs/migrate-example-aEf2PuZd
requirements.txt found, instead of Pipfile! Converting…
✔ Success!
Our new Pipfile after migration:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
[packages]
boto3 = "==1.14.28"
botocore = "==1.17.28"
docutils = "==0.15.2"
jmespath = "==0.10.0"
python-dateutil = "==2.8.1"
s3transfer = "==0.3.3"
six = "==1.15.0"
urllib3 = "==1.25.10"
[requires]
python_version = "3.8"
You can now remove the packages that you know are not essential for your code and are just sub-dependencies. After that, simply create a lock file, and you can commit both Pipfile and Pipfile.lock to your repo:
pipenv lock
---
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
Building requirements...
Resolving dependencies...
✔ Success!
Updated Pipfile.lock (dc9747)!
Specific version of library in Pipfile
When I started working with Pipenv, I had one question that was a bit difficult to find an answer to, at least at that time: "How can I specify the exact version of a library to use in my Pipfile?"
I tried many combinations in Pipfile, all of them incorrect:
boto3 = 1.14.28
boto3 = "1.14.28"
boto3==1.14.28
boto3=="1.14.28"
As you can see from the example Pipfile above, the correct usage is a quoted string that includes the operator:
boto3 = "==1.14.28"
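If you are moving pins over from a requirements file by hand, the rule is mechanical: the package name becomes the key, and the operator together with the version becomes the quoted value. A small sketch of that rule follows, with a helper name of my own; it only handles a bare name or a single version specifier and ignores extras, markers, and URLs.

```python
import re

# A package name, optionally followed by one operator and one version.
_REQUIREMENT = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*((?:==|>=|<=|~=|!=|<|>)\s*[^\s#;]+)?$"
)


def to_pipfile_entry(requirement: str) -> str:
    """Convert 'boto3==1.14.28' into the Pipfile line 'boto3 = "==1.14.28"'."""
    match = _REQUIREMENT.match(requirement.strip())
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    name, spec = match.groups()
    # No specifier means "any version", which Pipfile spells as "*".
    value = "".join(spec.split()) if spec else "*"
    # TOML bare keys cannot contain dots, so such names need quoting.
    key = f'"{name}"' if "." in name else name
    return f'{key} = "{value}"'


print(to_pipfile_entry("boto3==1.14.28"))  # prints boto3 = "==1.14.28"
print(to_pipfile_entry("boto3"))           # prints boto3 = "*"
```

In day-to-day work you rarely need this, since running pipenv install boto3==1.14.28 writes the entry for you, but it shows why none of the four attempts above could work: the value has to be a string, and the operator lives inside it.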
I hope this article showed you how useful Pipenv can be as a package manager for your project. It does not matter whether your team or project is small, medium, or large: all of them benefit from a properly defined, well-maintained dependency list, and Pipenv helps with that immensely. It helps prevent errors and bugs, and I am really glad that we already have a tool like this at our disposal in Python. It's time to start using it!