Easy way to test file uploads in Flask with PyTest

Not long ago I needed to write a test for an endpoint that takes uploaded files and sends them over to another service. I needed to test the initial upload, and we already had some testing infrastructure in place, so I only had to add new test cases. However, we were using PyTest, which I hadn't used before, as almost all of my testing experience was with unittest. We were also working with the latest Flask 1.1.x. I had some trouble coming up with code that would suit our tests, and the initial Google search did not yield satisfying results, so I decided to write this short article about how to properly test file uploads in Flask.

All the code I describe is available on my GitHub; feel free to check it out and run it yourself. I have started working with Makefiles and I like them a lot, so I will be running the tests and the Flask app with the "make" command.

Simple upload endpoint

Let's consider a basic endpoint implementation that receives images and stores them in a local "upload" folder:

@app.route('/upload', methods=['POST'])
def upload():
    try:
        logging.info(request.files)
        # A missing form field raises a KeyError subclass
        uploaded_file = request.files['image']
    except KeyError:
        raise InvalidUsage("Please provide image to upload", status_code=400)

    # Naive check: trusts the content type declared by the client
    if uploaded_file.content_type != 'image/jpeg':
        raise InvalidUsage("Only JPEG images are allowed", status_code=400)

    try:
        # Sanitize the client-supplied name before using it in a path
        filename = secure_filename(uploaded_file.filename)
        destination_file = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        uploaded_file.save(destination_file)
        return {"file": filename}, 201
    except Exception:
        # Report any failure while saving as a server error
        raise InvalidUsage('Failed to upload image', status_code=500)
Excerpt of app.py

I want to explain the pieces of this function, in case you are not familiar with all of them.

First of all, we access the files from the form-data upload and pick out the field named "image":

uploaded_file = request.files['image']

Then we can perform a MIME-type check on that file to see if it's an image or something else (if you are not sure what a MIME type is, you can read about it here):

if uploaded_file.content_type != 'image/jpeg':
        raise InvalidUsage("Only JPEG images are allowed", status_code=400)

Please note that this particular check is extremely naive: it trusts the content type declared for the uploaded part, which clients (including Flask's test client) typically derive from nothing more than the file extension. If you want true MIME-type detection, you can use python-magic. But this naive approach will be good enough for now, and we will exploit it later in the testing examples.
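If you would rather not pull in another dependency, one rough alternative is to peek at the first bytes of the upload, since JPEG data starts with the bytes FF D8 FF. The sketch below is only an illustration and not part of the app in this article: looks_like_jpeg is a hypothetical helper name, and I am assuming it would be called with the upload's underlying stream (for example uploaded_file.stream).

```python
import io

# JPEG data begins with these three bytes
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_like_jpeg(stream):
    """Return True if the binary stream starts with the JPEG signature.

    Hypothetical helper for illustration only.
    """
    position = stream.tell()
    header = stream.read(len(JPEG_SIGNATURE))
    # Rewind so that a later save() still sees the whole file
    stream.seek(position)
    return header == JPEG_SIGNATURE


# Made-up payloads: one with a JPEG-like header, one with plain text
real_looking = io.BytesIO(b"\xff\xd8\xff\xe0" + b"\x00" * 16)
plain_text = io.BytesIO(b"some random data")

print(looks_like_jpeg(real_looking))
print(looks_like_jpeg(plain_text))
```

A check like this only looks at the header, so it is still far from a full validation, but a renamed text file would no longer pass it.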

Once we can be somewhat sure that an image is being uploaded, we can process it. First we need the name of the file, because I want to keep the same filename in my upload folder:

filename = secure_filename(uploaded_file.filename)

The secure_filename function is part of Werkzeug's utils module (werkzeug.utils); it strips unnecessary and dangerous characters from the uploaded file's name.
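To get a feel for what it does, here is a small sketch with two made-up filenames (the inputs are my own examples, not taken from the app):

```python
from werkzeug.utils import secure_filename

# Hypothetical client-supplied names, chosen only for illustration
traversal_attempt = secure_filename("../../../etc/passwd")
name_with_spaces = secure_filename("My cool movie.mov")

print(traversal_attempt)
print(name_with_spaces)
```

The path separators disappear from the first name, so it can no longer point outside the upload folder, and the spaces in the second name are replaced with underscores.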

Then we want to construct a path where our uploaded file will be stored:

destination_file = os.path.join(app.config['UPLOAD_FOLDER'], filename)

The upload folder value is set at the top of app.py and we are using os.path.join function to construct a system-compatible path to the file.

You can also construct a file path manually by doing something like this:

destination_file = app.config['UPLOAD_FOLDER'] + '/' + filename

And I bet some of you reading this article are doing just that in your programs and apps. If you are developing on a Windows machine, paths conventionally use backslashes, but you are probably deploying your app on a Linux server, which requires forward slashes. Those system differences in paths can cause errors and bugs, and this is why you should use os.path.join(path, *paths), which handles the separator for you automatically.
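To see the difference without switching machines, here is a small sketch that uses ntpath and posixpath, the Windows and POSIX flavours of os.path from the standard library. The folder and file names are made up for the example.

```python
import ntpath      # what os.path is on Windows
import posixpath   # what os.path is on Linux and macOS

folder = "uploads"
filename = "pizza-cat.jpg"

# The same join call picks the separator for each platform
windows_path = ntpath.join(folder, filename)
linux_path = posixpath.join(folder, filename)

# Manual concatenation hard-codes one separator and doubles it
# when the folder already ends with a slash
manual_path = "uploads/" + "/" + filename
joined_path = posixpath.join("uploads/", filename)

print(windows_path)
print(linux_path)
print(manual_path)
print(joined_path)
```

On a real system you would simply call os.path.join and let Python pick the right flavour.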

After that we can finally save this image to the disk:

uploaded_file.save(destination_file)

And we want to return the clean filename along with the status code 201 (Created). Here we can use the shorthand added in Flask 1.1, where a view returns a tuple of a dictionary and an integer: the dictionary gets converted to JSON and the integer becomes the response status code:

return {"file": filename}, 201
I really like this shorthand return syntax
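If you want to see the shorthand in isolation, here is a minimal sketch, assuming Flask 1.1 or newer. The demo_app application, the /created route and its handler are hypothetical and exist only for this example:

```python
from flask import Flask

# Throwaway app used only to demonstrate the return shorthand
demo_app = Flask(__name__)


@demo_app.route("/created", methods=["POST"])
def created():
    # The dict becomes the JSON body, the integer becomes the status code
    return {"file": "pizza-cat.jpg"}, 201


client = demo_app.test_client()
response = client.post("/created")

print(response.status_code)
print(response.mimetype)
print(response.get_json())
```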

The test

Let's now set up our testing infrastructure. First of all, we want to create a folder named "tests" in our project and convert it to a package by creating an __init__.py file inside.

As we will be using Flask's test client to test our endpoint, we want to configure it as a fixture so it can be easily accessed in our tests. We can do that by creating a file named conftest.py:

import pytest
import app

@pytest.fixture(scope='module')
def test_client():
    flask_app = app.app
    testing_client = flask_app.test_client()
    ctx = flask_app.app_context()
    ctx.push()
    yield testing_client
    ctx.pop()
conftest.py

Now we can start writing some tests! This is where I stumbled at the very beginning: it seemed like everyone on the face of the Earth was writing tests with dummy streams of data, something like this:

import io


def test_upload_text_stream(test_client):
    file_name = "fake-text-stream.txt"
    data = {
        'image': (io.BytesIO(b"some initial text data"), file_name)
    }
    response = test_client.post('/upload', data=data)
    assert response.status_code == 400
Uploading dummy stream of data as text file

In this test, we define a file_name with a .txt extension because we want to make sure our naive MIME-type check does not allow text files to be uploaded; we only want images.

Then we create a data dictionary payload with the field 'image' in which we will store a tuple with 2 values:

  • a byte stream of data
  • name of the file we are uploading

Note that we can name the file anything we want; we don't have to send the real name of the file.

Once everything is ready, we POST the data with our test client and check for a status code of 400 to confirm that the server has rejected the payload.
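The tuple can also carry a third value, the content type, if you do not want it to be derived from the filename. Below is a small sketch of both forms against a throwaway app. The demo_app application and the /echo-type route are hypothetical; the route simply reports what the server received.

```python
import io

from flask import Flask, request

# Throwaway app used only to show what the server receives
demo_app = Flask(__name__)


@demo_app.route("/echo-type", methods=["POST"])
def echo_type():
    uploaded = request.files["image"]
    return {
        "filename": uploaded.filename,
        "content_type": uploaded.content_type,
    }


client = demo_app.test_client()

# Two values: the content type is derived from the .txt name
two_values = {"image": (io.BytesIO(b"some data"), "notes.txt")}
guessed = client.post("/echo-type", data=two_values).get_json()

# Three values: the content type is stated explicitly
three_values = {"image": (io.BytesIO(b"some data"), "notes.txt", "image/jpeg")}
explicit = client.post("/echo-type", data=three_values).get_json()

print(guessed["content_type"])
print(explicit["content_type"])
```

With the three-value form, the filename and the declared type can be varied independently, which is handy for testing each rejection path on its own.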

If we want to do the same dummy data stream upload, but make sure we succeed, we can do this:

def test_upload_image_stream(test_client):
    image_name = "fake-image-stream.jpg"
    data = {
        'image': (io.BytesIO(b"some random data"), image_name)
    }
    response = test_client.post('/upload', data=data)
    assert response.status_code == 201
    assert response.json['file'] == image_name
Uploading dummy stream of data as JPEG

This test is sending virtually the same data as the one above, but we have a different filename with a different extension (jpg instead of txt), so we "fool" our MIME-type detector, and the server will save the file, respond with status code 201 (Created) and return the name of the file.
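The reason the rename is enough: as far as I can tell, when no content type is given, the test client guesses one from the filename, much like Python's mimetypes module does. The sketch below only uses the standard library to show that guess for the two filenames from the tests above; it makes no claim about the exact code path inside Werkzeug.

```python
import mimetypes

# guess_type returns a (type, encoding) pair based on the name alone
text_guess, _ = mimetypes.guess_type("fake-text-stream.txt")
image_guess, _ = mimetypes.guess_type("fake-image-stream.jpg")

print(text_guess)
print(image_guess)
```

Neither call ever looks at the file contents, which is exactly why the dummy byte stream passes as a JPEG.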

But I wanted to run these tests with files I already had in my repository. I did not want to use any dummy data streams, because I was also testing integration with other services and dummy data would interfere with test results.

To use real files from your project, you can write tests like this:

def test_upload_textfile(test_client):
    file = "random-file.txt"
    data = {
        'image': (open(file, 'rb'), file)
    }
    response = test_client.post('/upload', data=data)
    assert response.status_code == 400
Testing upload with real text file

All you have to do is provide a byte stream of a real file: use the open() function and set the mode to 'rb' (read, binary). This reads the file as binary and provides the correct data for the upload.
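To show why the binary mode matters, here is a small standard-library sketch. It writes a few JPEG-like bytes to a temporary file (the file name and the bytes are made up for the example) and reads them back in both modes:

```python
import os
import tempfile

# First bytes of a typical JPEG, used here as stand-in image data
jpeg_like_bytes = b"\xff\xd8\xff\xe0"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.jpg")
    with open(path, "wb") as handle:
        handle.write(jpeg_like_bytes)

    # Binary mode returns the bytes exactly as stored
    with open(path, "rb") as handle:
        binary_content = handle.read()

    # Text mode tries to decode the bytes and fails on image data
    try:
        with open(path, "r", encoding="utf-8") as handle:
            handle.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

print(binary_content == jpeg_like_bytes)
print(text_mode_failed)
```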

To check our upload with a nice picture, I picked this one:

Pizza-cat or Cat-pizza?

I want to upload this cute picture of a Pizza-Cat to my endpoint and make sure it works correctly:

def test_upload_image_file(test_client):
    image = "pizza-cat.jpg"
    data = {
        'image': (open(image, 'rb'), image)
    }
    response = test_client.post('/upload', data=data)
    assert response.status_code == 201
    assert response.json['file'] == image
Testing upload with real JPEG

You can run all the tests in the project by running "make test".

After I found out how easy it is to create tests that use real files for uploads, I immediately wanted to write an article about it. I think we should all write tests for the code we write, because tested code is more reliable than untested code, and the more you know about testing something harder, like file uploads, the sooner you will get into the habit of writing tests along with your code.

Good luck with your tests!

M.