Rendering Responsive Text on Video using Python


Hands-on Computer Vision Project with OpenCV

In this post, I will show you how to render responsive text on a video using Python. We will use the OpenCV Python package for the whole project. This may sound difficult to do with just OpenCV, which is a computer vision package, but in this project you will see that it also handles simple video editing well.

I did a similar project before, which can be found here. In that article, I showed how to add static text to a video. This time, I decided to spice things up a little by adding updatable, responsive text without using a video editing library like MoviePy.

I am hoping that you will enjoy this project. Let’s get started!

Table of Contents

  • Introduction
  • Step 1 — Importing the Video
  • Step 2 — Reading the Text Document
  • Step 3 — Responsive Text Function
  • Final Step — Rendering the Video

Introduction

As I mentioned earlier, we will be using just the OpenCV package in this project. It is a super cool computer vision library. As with any machine learning or computer vision project, we need to install the package before we can use it. Let me share a short definition of OpenCV, and then we will install it.

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in the commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code.

https://opencv.org

Now, let’s install it using pip. Feel free to search “How to install OpenCV on Windows or macOS” if you want to know more about the installation. By the way, make sure Python is already installed before running this line in your terminal.

pip install opencv-python

After the installation is complete, we can move to our text editor and start programming. I don’t recommend using Jupyter Notebook for this project: since we will be creating a video player that runs in a new window, it is easier to open and close windows from a regular script. Otherwise, the program may freeze and stop responding.

Here is my first line in the text editor. I am importing the OpenCV package that we just installed into the program using the import statement.

import cv2

Perfect! Now we can move to the next step, where we will choose a video and import it into the program.


Step 1 — Importing the Video

This will be an easy step. We will do two things: import a video and find its frame rate. We will use the frame rate to convert frame counts into seconds so that we can control how long each piece of text stays on the screen.

Let’s import the video using the VideoCapture class. Here is the official reference page if you want to learn more about VideoCapture.

tree_video = cv2.VideoCapture('tree.mov')

Now, let’s get the frame rate of the imported video.

fps = tree_video.get(cv2.CAP_PROP_FPS)
print(fps)
#result
23.97




I got 23.97 printed in the terminal when I ran this line, which means the video is running at roughly 24 frames per second. So about 24 iterations of the capture loop correspond to 1 second of video. I will explain how I use this calculation in the Responsive Text Function step.
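Just to make the math concrete, here is a tiny sketch of that conversion (the name frame_index is only illustrative):

# convert a frame count into an elapsed time in seconds
frame_index = 120
seconds = frame_index / fps
print(seconds)
# roughly 5 seconds at ~24 fps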


Step 2 — Reading the Text Document

In this step, we will read text from a text document. We will use Python’s built-in open function. Here is a screenshot of the text document that I will be importing.

image by author

These lines are from a poem that I wrote, called “Under the Giant Tree”.

Make sure each sentence is on a new line. When reading the text document, we will convert each line into a list item.

poem = open('under_the_giant_tree.txt')
poem_lines = list(poem)
print(poem_lines)
image by author
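One small detail worth noting: list(poem) keeps the trailing newline character on every line, and cv2.putText does not interpret newline characters. If that ever shows up as a stray character on the video, an optional tweak like this sketch cleans the lines up while reading them:

# optional: read the poem and strip the trailing newline from every line
with open('under_the_giant_tree.txt') as poem_file:
    poem_lines = [line.strip() for line in poem_file]
print(poem_lines)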

Step 3 — Responsive Text Function

In this step, we are going to write a function that we will use to update the text. I decided to update the text according to different timestamps, but feel free to adapt it to different cases.

frame_ = 0

def text_update(frame_):
    # pick a poem line based on how many frames have been shown so far
    if frame_ < (5 * fps):
        text = str(poem_lines[2])
    elif frame_ < (10 * fps):
        text = str(poem_lines[4])
    elif frame_ < (15 * fps):
        text = str(poem_lines[6])
    elif frame_ < (20 * fps):
        text = str(poem_lines[8])
    else:
        text = "no text found"
    return text

I am using an if-else chain to work out which part of the video is currently playing. The same logic could also be written as a lookup table (see the sketch below). Feel free to play around with the code.

I have defined a new variable called “frame_” that counts how many frames have been displayed so far; we pass it into the function on every iteration of the capture loop. Dividing that counter by the frame rate gives the elapsed time in seconds, and since we know from the first step that the video runs at roughly 24 fps, comparing the counter against 5*fps, 10*fps, and so on updates the text every five seconds. Inside each branch, I assign the matching line of the poem to the text variable.
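If you prefer a data-driven version, here is a sketch of the same idea using a small schedule list instead of the if-else chain. The schedule below reproduces the same five-second steps, and the names schedule, end_second, and line_index are just illustrative:

# each entry is (end time in seconds, index of the poem line to show)
schedule = [(5, 2), (10, 4), (15, 6), (20, 8)]

def text_update(frame_):
    seconds = frame_ / fps  # convert the frame counter into seconds
    for end_second, line_index in schedule:
        if seconds < end_second:
            return str(poem_lines[line_index])
    return "no text found"

Adding another line of the poem then only means appending one more pair to the list.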

Now, let’s move to the final step.


Final Step — Rendering the Video

Great! We are almost done. In this final step, we are going to combine everything we did so far. We will use a while loop to play the video frame by frame, and we can end the loop at any time by pressing the “q” key.

while True:
    # read the next frame from the video; stop when the video ends
    ret, frame = tree_video.read()
    if not ret:
        break

    # define the font we want to use for the text
    font = cv2.FONT_HERSHEY_SIMPLEX

    # get the text for the current frame and draw it on the frame
    on_video_text = text_update(frame_)
    cv2.putText(frame, on_video_text, (50, 50), font, 1, (0, 255, 255), 2, cv2.LINE_4)

    # update the frame counter
    frame_ = frame_ + 1

    # display the annotated frame in a new window
    cv2.imshow('poem on video', frame)

    # stop the loop when the "q" key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

tree_video.release()
cv2.destroyAllWindows()

So what is happening in the code above:

  • We start by reading the next frame of the imported video, and we break out of the loop when no frame is returned, which means the video has ended.
  • And then, we define the font that we want to use for our text.
  • Then, we are calling the text_update function to update the text.
  • We use putText method to add our responsive text on the video. Here is the official reference link where you can learn more about putText.
  • After that, we increment the frame counter so that we pass the right value into our text_update function on the next iteration.
  • To display the video on a new window we are using imshow method.
  • And finally, an if statement to end the while loop. The program stops when the “q” key is pressed; if nothing is pressed, the loop ends on its own when the video finishes. (If you also want to save the annotated video to a file, see the sketch right after this list.)
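The loop above only displays the annotated frames. If you also want to save the result to a video file, OpenCV’s VideoWriter can be slotted into the same loop. Here is a minimal sketch, assuming an output file named ‘poem_on_tree.mp4’ (the file name and the mp4v codec are just examples):

# create the writer once, before the while loop starts
# the frame size is read from the capture so the writer matches the input video
width = int(tree_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(tree_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
writer = cv2.VideoWriter('poem_on_tree.mp4', fourcc, fps, (width, height))

# then, inside the while loop, right after cv2.putText(...):
# writer.write(frame)

# and after the loop, next to tree_video.release():
# writer.release()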

Here is a screenshot of the video playing after I ran the program:

image by author

Congrats! We have learned how to render updatable, responsive text on a video using Python. I hope you enjoyed this hands-on computer vision project. Working on hands-on programming projects is the best way to sharpen your coding skills. I am so happy if you learned something new today.

Feel free to reach me if you have any questions while implementing the code. I do my best to get back within two weeks.

Let’s connect. Check my blog and YouTube channel to stay inspired. Thank you!


More Computer Vision & Machine Learning Related Projects

Building a Face Recognizer in Python
Step-by-step guide to face recognition in real-time using OpenCv library

Extracting Speech from Video using Python
Simple and hands-on project using Google Speech Recognition API
