
Building Web Applications with Streamlit for NLP Projects

A code demonstration and explanation of the basics of Streamlit while solving NLP tasks such as Sentiment Analysis, NER, and Text…

Hands-on Tutorials

Image from Unsplash

One of the most common tasks Data Scientists struggle with is presenting their model/project in a format that users can interact with. A model is of little use to external users if it is not presented in some type of application. This leads into the vast field of web/app development, and with it more languages and tools such as HTML, CSS, ReactJS, Dash, Flask, and others that can help you create a front-end interface for users to interact with and get results out of your model. This field can be intimidating at first and alien to the traditional Data Science skillset.

Streamlit is a simple Python API that flattens this learning curve and removes the need for Data Scientists to know many of these tools. The general full-stack arsenal should still be the go-to for large-scale projects, but if you ever need a quick front-end for your data science projects, Streamlit more than serves the purpose. In this article, I want to demonstrate the basics of Streamlit while building a web application that lets users work with models solving common NLP tasks such as Sentiment Analysis, Named Entity Recognition, and Text Summarization.

NOTE: While we could build custom models for each of these tasks, I went with pre-trained models in Python libraries as this article is more centered around building a web application to display your ML/NLP projects through Streamlit.

Table of Contents (ToC)

  1. Web App Setup
  2. Data
  3. Sentiment Analysis
  4. Named Entity Recognition (NER)
  5. Text Summarization
  6. Entire Code & Conclusion

1. Web App Setup

Before building any of our models, we need a template for how our application is going to look. Let’s first import streamlit (to install it for Python 3, use pip3 install streamlit).

To explain what our application is, let’s add a title and a smaller header asking which of our three NLP services the user wants to work with. With streamlit imported as st, the calls st.title(), st.header(), and st.subheader() produce heading levels in descending order of size. For plain, non-heading text you can use st.text().
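A minimal sketch of the import and heading calls described above. The title and prompt strings are placeholders rather than the app's actual wording, render_headings is a hypothetical helper, and the try/except around the import is only there so the snippet can be loaded where Streamlit is not installed; in a real app you would import streamlit unconditionally.

```python
try:
    import streamlit as st
except ImportError:  # assumption: lets the sketch load without Streamlit
    st = None

# Placeholder strings, not taken from the original app.
APP_TITLE = "Natural Language Processing Web Application"
APP_PROMPT = "What type of NLP service would you like to use?"


def render_headings(ui):
    """Draw the page title and a smaller prompt below it.

    `ui` is the Streamlit module (or any object with the same methods).
    """
    ui.title(APP_TITLE)       # largest heading level
    ui.subheader(APP_PROMPT)  # smaller heading under the title


if st is not None:
    render_headings(st)
```

Saved as a script (for example app.py, a name chosen here for illustration), this can be started with streamlit run app.py.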

Result of heading code with Streamlit

Now that we have the main question for the application, we can build a drop-down menu that lets the user pick one of the three tasks our models can complete. This can be done using the st.selectbox() call.
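The drop-down could be sketched as follows. The menu label and the three option strings are assumptions based on the tasks named in this article, and choose_service is a hypothetical helper; the guarded import is again only there so the snippet loads without Streamlit.

```python
try:
    import streamlit as st
except ImportError:  # assumption: lets the sketch load without Streamlit
    st = None

# The three services covered in this article; the exact labels are assumptions.
NLP_SERVICES = ("Sentiment Analysis", "Entity Extraction", "Text Summarization")


def choose_service(ui):
    """Show a drop-down menu and return the selected label as a string."""
    return ui.selectbox("NLP Service", NLP_SERVICES)


if st is not None:
    option = choose_service(st)
```

Because st.selectbox() returns the selected entry itself, the rest of the app can branch on option with plain string comparisons.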

Drop-down menu for NLP Service to work with

The service chosen out of the three choices is stored as a string in the option variable. The next piece of information we need is the text the user wants our models to perform the chosen task on. For this we can use the st.text_input() call, which renders a single-line field similar to a text input element in HTML (st.text_area() is the multi-line counterpart, closer to the textarea tag).

Text area for user to enter in input for models

The text input is then stored in the text variable. Now that we have the option and text variables containing all the information we need to work with, we need to make an area to display the results.
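A sketch of the text field and the results area described above. The "Enter text" and "Results" labels are assumptions, and collect_text is a hypothetical helper; the guarded import only keeps the snippet loadable without Streamlit.

```python
try:
    import streamlit as st
except ImportError:  # assumption: lets the sketch load without Streamlit
    st = None


def collect_text(ui):
    """Ask for the text to analyze, then open a section for the results."""
    text = ui.text_input("Enter text")  # single-line field; returns a str
    ui.header("Results")                # heading above the model output
    return text


if st is not None:
    text = collect_text(st)
```

The field returns an empty string until the user types something, so later steps may want to check that text is non-empty before calling a model.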

With our template now set up, we have something that looks like the screenshot below.

Overall Template made with Streamlit

With just about 10 lines of Python code a very simple yet effective template has been created. Now that we have our front-end we can work with processing/feeding this information to our models and returning the results to the user.

2. Data

For the NLP tasks we will be using the following piece of text data from an article titled "Microsoft Launches Intelligent Cloud Hub To Upskill Students In AI & Cloud Technologies". The article/text was found through the Medium article linked. We will use the following text block as the input text for all three of the NLP tasks.

In an attempt to build an AI-ready workforce, Microsoft announced Intelligent Cloud Hub which has been launched to empower the next generation of students with AI-ready skills. Envisioned as a three-year collaborative program, Intelligent Cloud Hub will support around 100 institutions with AI infrastructure, course content and curriculum, developer support, development tools and give students access to cloud and AI services. As part of the program, the Redmond giant which wants to expand its reach and is planning to build a strong developer ecosystem in India with the program will set up the core AI infrastructure and IoT Hub for the selected campuses. The company will provide AI development tools and Azure AI services such as Microsoft Cognitive Services, Bot Services and Azure Machine Learning. According to Manish Prakash, Country General Manager-PS, Health and Education, Microsoft India, said, "With AI being the defining technology of our time, it is transforming lives and industry and the jobs of tomorrow will require a different skillset. This will require more collaborations and training and working with AI. That's why it has become more critical than ever for educational institutions to integrate new cloud and AI technologies. The program is an attempt to ramp up the institutional set-up and build capabilities among the educators to educate the workforce of tomorrow." The program aims to build up the cognitive skills and in-depth understanding of developing intelligent cloud connected solutions for applications across industry. Earlier in April this year, the company announced Microsoft Professional Program In AI as a learning track open to the public. The program was developed to provide job ready skills to programmers who wanted to hone their skills in AI and data science with a series of online courses which featured hands-on labs and expert instructors as well. This program also included developer-focused AI school that provided a bunch of assets to help build AI skills.

3. Sentiment Analysis

For Sentiment Analysis, we’ll be working with the textblob library to generate the polarity and subjectivity of the text entered. (To install it for Python 3, use pip3 install textblob.)

To tokenize the data, we use the NLTK library to split the text into sentences, which breaks larger text into smaller portions for visual analysis. Using the textblob library, we then compute a sentiment score per sentence so we can plot the sentiment throughout the text. Note: textblob represents sentiment on a scale from -1 to 1, with 1 being the most positive and -1 the most negative.

Using another neat Streamlit feature, st.line_chart(), we can plot the sentiment of each sentence as a line graph so the user can visually analyze the text data shown in the second section.
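The per-sentence scoring and plot can be sketched like this. The function names are hypothetical, ui stands for the Streamlit module (import streamlit as st), and the imports sit inside the function only so the snippet loads without NLTK or textblob installed. NLTK's sentence tokenizer also needs its Punkt data downloaded once (for example with nltk.download("punkt"); the resource name can differ between NLTK versions).

```python
def sentence_polarities(text):
    """Return one textblob polarity score (-1 to 1) per sentence of `text`."""
    # Imported lazily (an assumption of this sketch) so the module loads
    # even where the NLP libraries are not installed.
    from nltk.tokenize import sent_tokenize
    from textblob import TextBlob

    return [TextBlob(sentence).sentiment.polarity
            for sentence in sent_tokenize(text)]


def plot_sentiment(ui, text):
    """Draw the sentence polarities, in reading order, as a line chart."""
    scores = sentence_polarities(text)
    ui.line_chart(scores)  # x-axis: sentence index, y-axis: polarity
    return scores
```

In the app this would be called as plot_sentiment(st, text) when the sentiment option is selected.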

Plot of Sentiment Score per Sentence in Text

If we want an overall sentiment of the text, we can use the TextBlob API call on the entire data string.

Using the st.write() call we can display the sentiment returned from textblob. The two features returned are Polarity and Subjectivity. Polarity is between [-1,1] and Subjectivity is between [0,1] with 0.0 as objective and 1.0 as subjective.
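A sketch of the overall score and its display. The helper names and the three-decimal formatting are assumptions; ui stands for the Streamlit module, and textblob is imported inside the function only so the snippet loads without it.

```python
def overall_sentiment(text):
    """Return (polarity, subjectivity) for the whole text."""
    from textblob import TextBlob  # lazy import: an assumption of this sketch

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity


def show_overall_sentiment(ui, text):
    """Write both scores to the page and return them."""
    polarity, subjectivity = overall_sentiment(text)
    ui.write(f"Polarity: {polarity:.3f}")          # -1 (negative) to 1 (positive)
    ui.write(f"Subjectivity: {subjectivity:.3f}")  # 0 (objective) to 1 (subjective)
    return polarity, subjectivity
```

In the app this would be called as show_overall_sentiment(st, text).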

Overall sentiment of text inputted

4. Named Entity Recognition (NER)

For NER we’ll be using another popular NLP library, spaCy.

spaCy has a nice feature: along with each entity it recognizes, it provides the entity's type. Examples include PERSON, ORG, DATE, MONEY, and more. To extract the entities and their labels, we can iterate through the entire list of entities found and collect them into a dictionary.
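A sketch of that extraction step. The function names are hypothetical, and en_core_web_sm is an assumed model choice (the article does not say which one is used); it has to be downloaded first with python -m spacy download en_core_web_sm. spaCy is imported inside the function only so the snippet loads without it.

```python
def entities_by_text(ents):
    """Map each entity's text to its label, e.g. {"Microsoft": "ORG"}.

    `ents` is any iterable of objects with `.text` and `.label_`,
    such as spaCy's `doc.ents`. Repeated entity texts collapse into
    one key because dictionary keys are unique.
    """
    return {ent.text: ent.label_ for ent in ents}


def extract_entities(text, model="en_core_web_sm"):
    """Run a spaCy pipeline over `text` and return its entity dictionary."""
    import spacy  # lazy import: an assumption of this sketch

    nlp = spacy.load(model)  # assumed model; must be installed beforehand
    return entities_by_text(nlp(text).ents)
```

Loading the model is the slow part, so a real app would likely load it once rather than on every call.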

For our user to get an easy visual representation of entities of each type, we can make a helper function that lists all entities of each type.

Using this function we can extract entities for each type and write the results out for the user to cleanly see.
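One way such a helper and its display loop might look. The names, the default label list, and the "none found" fallback are assumptions of this sketch; entity_labels is a dictionary mapping entity text to its spaCy label, and ui stands for the Streamlit module.

```python
def entities_of_type(entity_labels, label):
    """Return the entity texts whose label equals `label`."""
    return [text for text, found in entity_labels.items() if found == label]


def show_entities(ui, entity_labels, labels=("PERSON", "ORG", "DATE", "MONEY")):
    """Write one line per entity type listing the matching entities."""
    for label in labels:
        matches = entities_of_type(entity_labels, label)
        ui.write(f"{label}: {', '.join(matches) or 'none found'}")
```

Keeping the filter separate from the display code means the same helper works for any label spaCy produces, not only the four shown here.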

Entities of each type from text data

5. Text Summarization

For text summarization, we’ll be using the gensim library which has a simple summarize call.

Using the summarize call, we can easily generate a summary of the input text. Note: the input needs to contain more than one sentence for gensim's summarize call to work.
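A sketch of the summarization step. Be aware that the gensim.summarization module was removed in Gensim 4.0, so this assumes an older 3.x release is installed. The summarize_text wrapper and its empty-string fallback are assumptions of this sketch, not part of gensim.

```python
def summarize_text(text):
    """Return gensim's extractive summary of `text`, or "" if it cannot make one."""
    # Lazy import: an assumption of this sketch. Requires gensim < 4.0,
    # because the summarization module was removed in 4.0.
    from gensim.summarization import summarize

    try:
        return summarize(text)
    except ValueError:
        # Raised by the 3.x releases when the input has only one sentence.
        return ""
```

In the app, the returned string can be passed straight to st.write().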

Summarized text from input

6. Entire Code & Conclusion

RamVegiraju/NLPStreamlit

Streamlit allows you to seamlessly integrate popular Data Science/ML libraries and display your projects in simple, concise web applications and dashboards. While the more common full-stack skillset will serve better for projects at scale, the short development time and simplicity of Streamlit greatly help Data Scientists & ML Engineers trying to showcase their projects.

I hope that this article has been useful for anyone trying to use Streamlit with their ML/NLP projects. Feel free to leave any feedback in the comments or connect with me on LinkedIn if you are interested in chatting about ML & Data Science. Thank you for reading!

