Developing an ML Model Performance Validation Web App using Streamlit

yuenhern
10 min read · Jun 23, 2021


To educators:

  • Do you often give an entire dataset to your students for ML modelling homework?
  • Are you worried that your students might rig the test dataset to score high, rather than doing better EDA or model tuning?
  • Have you ever thought of developing an online, 24/7 solution checker web app for students to submit their predictions but had to resort to Google Sheets because you were unsure where to start?

If your answers to the above questions are YES, this article is the perfect fit for you!

Acronyms:

  • ML: machine learning
  • EDA: exploratory data analysis
  • IDE: Integrated Development Environment
  • TOML: Tom’s Obvious, Minimal Language

1. Background

One fine day, I was looking for a dataset on Kaggle to practise on, and found this interesting Term Deposit Prediction Dataset by Brajesh Mohapatra. The description of the dataset is quoted below (source), with important points in bold:

Your client is a retail banking institution. Term deposits are a major source of income for a bank. A term deposit is a cash investment held at a financial institution. Your money is invested for an agreed rate of interest over a fixed amount of time, or term. The bank has various outreach plans to sell term deposits to their customers such as email marketing, advertisements, telephonic marketing and digital marketing. Telephonic marketing campaigns still remain one of the most effective way to reach out to people. However, they require huge investment as large call centers are hired to actually execute these campaigns. Hence, it is crucial to identify the customers most likely to convert beforehand so that they can be specifically targeted via call. You are provided with the client data such as : age of the client, their job type, their marital status, etc. Along with the client data, you are also provided with the information of the call such as the duration of the call, day and month of the call, etc. Given this information, your task is to predict if the client will subscribe to term deposit.

The dataset comes with a train dataset (train.csv), a test dataset (test.csv) and a solution checker (solution_checker.xlsx), which is a Microsoft Excel file with two sheets. The solution_checker.xlsx is the file that we are interested in.

According to the dataset author, the workflow for using the solution checker file is as follows (source):

You can use solution_checker.xlsx to generate score (accuracy) of your predictions. This is an excel sheet where you are provided with the test IDs and you have to submit your predictions in the “subscribed” column. Below are the steps to submit your predictions and generate score:

a. Save the predictions on test.csv file in a new CSV file.
b. Open the generated CSV file, copy the predictions and paste them in the subscribed column of solution_checker.xlsx file.
c. Your score will be generated automatically and will be shown in Your Accuracy Score column.

Note: The solution checker uses accuracy as the only evaluation metric.

This got me thinking:

  • Can we make this more interactive?
  • Can we make this process easier?
  • Can we include other evaluation metrics on top of accuracy — precision, recall, F1 score?
  • Can we remove the copy-pasting process entirely?

If similar questions are popping up in your head, you are welcome to join me on this quest!

2. Development Tools

There are a lot of frameworks out there for web app development (a simple Google search will give you an exhaustive list). I am not from a computer science background and have never learnt HTML, CSS or JavaScript, so the learning curve would have been steep if I wanted to do it the conventional way.

Luckily for me, there is an “unconventional” way to do this (big thank you to Streamlit!). Streamlit is a framework that lets you build web apps in pure Python, so you can focus on what matters (building working apps) without worrying about HTML, CSS or JavaScript. There are a lot of tutorials out there, but I really recommend reading their documentation, which is beautifully written and easy to comprehend.

Tools I used for my development and deployment:

  • Streamlit Share (source)

Python environment:

  • Python version: 3.8.10
  • NumPy version: 1.20.2
  • Pandas version: 1.2.4
  • Scikit-learn version: 0.24.2
  • Streamlit version: 0.82.0

Do note that the Python version is not specified in requirements.txt.
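
For reference, a requirements.txt pinning the versions above would look something like this (the actual file in the repo may differ slightly):

numpy==1.20.2
pandas==1.2.4
scikit-learn==0.24.2
streamlit==0.82.0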

3. Project Structure

Next up, we are going to dive into the project structure. In short, this web app should be able to perform the following steps:

  1. Accept a user-uploaded file containing predictions
  2. Verify that the number of instances in the uploaded file is 13,564 (without column headers)
  3. Validate the predictions against the solutions and return accuracy, precision, recall and F1 score
  4. Show the evaluation metrics on the web app

Note: Users are informed to upload only their predictions, excluding column headers.

Project structure (Source: Author)

At a glance, the project components are as follows:

  • metadata: A folder containing metadata (media) for README.
  • test: A folder of CSV files containing predictions for development and testing. It is not included in the public repo and is listed in the .gitignore file.
  • webapp: A folder containing everything needed for the web app to work, both locally and online.
  • .gitignore: A file to specify which files to omit when pushing to GitHub.
  • LICENSE: A license file (I am using Apache 2.0 for this project).
  • requirements.txt: A file to specify the dependencies needed to be downloaded later during the deployment (see Section 2).

Let’s talk more about the webapp folder. It contains the .streamlit folder, __init__.py, streamlit_app.py and verify.py. We will have a look at each component, one by one.

The .streamlit Folder

Streamlit Share apps must be deployed from a public GitHub repo, so it would defeat the purpose if the solutions were available in the repository. The workaround: we can use Streamlit’s secrets management feature.

This folder contains a TOML file: a configuration file holding the secrets (the actual labels of the test set) which the web app uses but which must be hidden from the public.

The original solution_checker.xlsx contains 13,564 instances, so I created a separate Jupyter Notebook to parse the data from the original XLSX file and write it into a new file named “secrets.toml”.

Then, I moved the file into this project and placed it in the .streamlit folder so that it can be used during development. Don’t forget to add the .streamlit folder to your .gitignore as well.

The index and solutions of the actual test labels, written in TOML format. (Source: Author)
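
As a rough idea, the notebook step could look like the sketch below. Note that the sheet index and the column names (“ID” and “subscribed”) are my own assumptions and may differ from the actual file.

import json
import pandas as pd

# Read the sheet that holds the actual test labels (reading .xlsx requires openpyxl)
solutions = pd.read_excel("solution_checker.xlsx", sheet_name=0)

# Write the labels out as TOML arrays (JSON arrays double as valid TOML arrays here)
with open("secrets.toml", "w") as f:
    f.write("index = {}\n".format(json.dumps(solutions["ID"].tolist())))
    f.write("solutions = {}\n".format(json.dumps(solutions["subscribed"].tolist())))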

The verify.py module

Next, we need two things: a way to verify the number of instances in the uploaded submission file, and a way to score the predictions it contains. We can write a script named verify.py containing a function for each task.

A short description of each function (a rough sketch follows the list):

  • verify_upload(): This function reads the uploaded CSV file and retrieves the actual solutions from the secrets. It then checks that both have the same number of instances; otherwise, it prints an error message and raises an error.
  • verify_solution(): This function reads the uploaded CSV file, retrieves the actual solutions from the secrets and converts both to NumPy arrays. The two arrays are then used to calculate accuracy, precision, recall and F1 score with Scikit-learn, which the function returns.
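
For illustration, here is a minimal sketch of what verify.py could look like, assuming the labels are stored under a “solutions” key in the secrets (as above) and are already encoded as 0/1. The actual code in the repo may differ.

import numpy as np
import pandas as pd
import streamlit as st
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score


def verify_upload(uploaded_file):
    # Read the uploaded predictions (no header row expected)
    preds = pd.read_csv(uploaded_file, header=None)
    # The number of predictions must match the number of hidden solutions
    solutions = st.secrets["solutions"]
    if len(preds) != len(solutions):
        st.error(f"Expected {len(solutions)} predictions, but got {len(preds)}.")
        raise ValueError("Uploaded file has the wrong number of instances.")


def verify_solution(uploaded_file):
    # Read the uploaded predictions and convert both arrays to NumPy
    preds = pd.read_csv(uploaded_file, header=None).iloc[:, 0].to_numpy()
    solutions = np.array(st.secrets["solutions"])
    # Compute the four evaluation metrics with Scikit-learn
    return (
        accuracy_score(solutions, preds),
        precision_score(solutions, preds),
        recall_score(solutions, preds),
        f1_score(solutions, preds),
    )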

Now that we have functions to perform both tasks, let’s stitch them together using a main script, named streamlit_app.py.

The streamlit_app.py script

We need streamlit_app.py to chain together all the previously defined functions so that the app behaves as we intended.

Recall that we need the web app to:

  1. Accept a user-uploaded file containing predictions
  2. Verify that the number of instances in the uploaded file is 13,564 (without column headers) — accomplished by verify_upload()
  3. Validate the predictions against the solutions and return accuracy, precision, recall and F1 score — accomplished by verify_solution()
  4. Show the evaluation metrics on the web app

Since №2 and №3 have been sorted out, our streamlit_app.py only needs to perform №1 and №4.

Instead of asking users to copy their predictions and paste them into a field, the app uses the st.file_uploader() function so that users simply upload a file and the app takes over from there.

A short description of streamlit_app.py (a sketch follows the list):

  • Creates a widget where users can upload their CSV files
  • Verifies the uploaded file using the verify_upload() function
  • Validates the submitted predictions using the verify_solution() function
  • Prints out the accuracy, precision, recall and F1 score on the web app
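
Putting it together, a minimal sketch of streamlit_app.py might look like this; the widget label and the way the metrics are displayed are my own choices, not necessarily the original ones.

import streamlit as st

from verify import verify_solution, verify_upload

st.title("Term Deposit Prediction: Solution Checker")

# 1. Widget where users can upload their CSV file of predictions
uploaded_file = st.file_uploader("Upload your predictions (CSV, no header)", type="csv")

if uploaded_file is not None:
    # 2. Verify the number of instances in the uploaded file
    verify_upload(uploaded_file)
    uploaded_file.seek(0)  # rewind so the file can be read a second time

    # 3. Validate the predictions and compute the evaluation metrics
    accuracy, precision, recall, f1 = verify_solution(uploaded_file)

    # 4. Show the evaluation metrics on the web app
    st.write(f"Accuracy: {accuracy:.4f}")
    st.write(f"Precision: {precision:.4f}")
    st.write(f"Recall: {recall:.4f}")
    st.write(f"F1 score: {f1:.4f}")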

4. Testing the App Locally

Next, we want to see what the app looks like when it is deployed. But before we deploy it on Streamlit Share, we can test it locally by hooking it up to one of the local ports.

To test the app, just enter the following command in the PyCharm terminal:

streamlit run streamlit_app.py

Before running the command, ensure that you are in the correct environment (please do not use your base environment for this project!).

Running your app locally (Source: Author)

You will see something like this, with the URL “localhost:XXXX”:

Deploying your app locally (Source: Author)

Congratulations, our app is halfway done! Next, let’s deploy our web app via Streamlit Share.

5. Deploying Web App on Streamlit Share

Streamlit Share is a feature by Streamlit that lets hobbyists quickly deploy data science web apps without needing to know much about web deployment frameworks.

Before deploying, you need to ensure a few things:

  1. Your source code is publicly available on GitHub.
  2. You have a requirements.txt which specifies the dependencies needed for your web app to work (see mine here). However, you do not need to specify the Python version, as we will specify it in the Streamlit Share settings later.

On top of that, it is good practice to always specify a license for your project! The license tells users what they can do with your project. I am using the Apache 2.0 license for this project. Check this out if you are unsure which license to choose.

We are using Streamlit’s secrets management in our project, which means that we need a .gitignore file specifying the files we do not wish to push to the GitHub remote repo. In our case, we should list the .streamlit and test folders in it, because we don’t want the solutions to be publicly accessible, as shown in the sketch below.
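
For example, the relevant lines of the .gitignore could be as simple as:

# Keep the hidden solutions and the local test predictions out of the public repo
.streamlit/
test/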

To deploy, head over here. You will see the page below. If you do not yet have an invite, request one (marked as “1” in the figure below). Otherwise, you may sign in (marked as “2” in the figure below).

Once you have an invite, sign in using the GitHub account you registered with in your invite request.

Streamlit Share login page. (Source: Author)

You will arrive at the home page of your dashboard. All the apps that you have deployed will be shown here.

Streamlit Share dashboard. (Source: Author)

To deploy a new web app, hit “New app” and a dialogue box will appear. Now, you need to do the following:

  1. Paste your GitHub repo URL, or click on the field and the repos in your associated GitHub account will appear.
  2. Choose the branch that you want to deploy from. Normally it’s “master” or “main”.
  3. Specify the main file to execute. In our case, streamlit_app.py is our main file.
  4. Click “Advanced settings…” and another dialogue box will appear. For the Python version, choose Python 3.8, since we developed our application with this version.
  5. Copy the contents of our secrets.toml, paste them into the “Secrets” field and hit “Save”.
  6. Hit “Deploy!”
Dialogue box for deployment settings. (Source: Author)
Solutions copy-pasted into the “Secrets” field. (Source: Author)

6. Wait for the App to Build Successfully

Now, wait patiently at the waiting page for the app to build…

Our web app is building… (Source: Author)

If the build is successful, you will see that your app is live! Now, test it by uploading some files to see if it works as intended.

And it’s live! (Source: Author)
Test it on the spot! (Source: Author)

7. Conclusion

Throughout this tutorial, we have built a web app that lets anyone validate how good their ML model is on the Term Deposit Prediction Dataset.

If you are an educator who wants to give your students a challenge, this is definitely a way to really test their data science skills!

If you are keen to do the same thing but need some code to refer to, feel free to check out my GitHub repo for this project. If you encounter any issues, file an issue at the repo, or ping me on LinkedIn if you have any new ideas!

Credit

  1. Brajesh Mohapatra for the dataset
  2. Streamlit team for the Streamlit Share feature
