Hands-on Deep Learning Project
In this article, we will learn how to turn a product review video into automated product feedback with a short Python program. We will use machine learning, through a cloud API, and a few Python libraries. Watching product review videos can be time-consuming, especially when you are on your 10th video about a product you want to purchase.
I know exactly how that feels — I am doing the same thing. Before buying a valuable asset, I usually watch many review videos by content creators on YouTube. I want to make sure that I am buying the right thing matching my criteria. This product can be a simple cat feeder, a high-tech laptop, or maybe even a self-driving car.
Different opinions about a product can be helpful before making a purchase. We get to see how it looks and functions in real life; we also listen to people who have some experience with the product. Long story short, the process of watching so many videos is draining. That’s where the thought of writing an automated product feedback program came in.
Hoping that you’ll enjoy this project, let’s get started!
Table of Contents
- Getting Started
- Step 1 — Libraries
- Step 2 — Download Product Review Video
- Step 3 — Audio Transcription with Analysis
- Final Step — Check the Results
Getting Started
For this project, we are going to use AssemblyAI’s Speech-to-text API. Instead of reinventing the wheel, we will save time and energy by using an online cloud platform. The Speech-to-text API also gives us some powerful functions to use while doing the transcription, such as topic detection. We will go deeper into this in the third step of this article.
The API offers a free trial where we can have some idea of its capabilities. After creating an account, we will have access using a unique API key. This key will be the bridge between our personal computers and the cloud.
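Since the key gives access to our account, it is safer to keep it out of the notebook itself. Here is a minimal sketch that reads it from an environment variable instead. The variable name ASSEMBLYAI_API_KEY and the helper load_api_key are assumptions made for this example, not something the API requires.

```python
import os


def load_api_key(env_var="ASSEMBLYAI_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value can then be used wherever the code in this article says "API Key goes here."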
Step 1 — Libraries
We are going to need several Python libraries for this project. Some of them are built in, which means they ship with Python. Others are third-party libraries, which we have to install before using them. Which ones are already available can depend on your Python version and distribution.
Here is the list of libraries we need:
- requests
- sys
- time
- json
- os
- urllib
- youtube-dl
Among these libraries, youtube-dl has to be installed. (requests is also a third-party library, but many Python distributions, such as Anaconda, already include it.) youtube-dl is a library used to download videos in different formats. We will need it to download a product review video. Here is the GitHub repository for this package.
We will use pip, Python's package manager, to install it. Here is the command to run in the terminal window:
pip install youtube-dl
After the installation is complete, we can go ahead and create a new Jupyter Notebook. Then, we will import the libraries into our program.
import requests
import sys
import time
import json
import os
import urllib.request
import youtube_dl
Step 2 — Download Product Review Video
It’s time to choose the product review video. I picked Marques Brownlee’s M1 Max Macbook Pro review.
We will assign the link of the video to a variable.
vid = "https://www.youtube.com/watch?v=rr2XfL_df3o"
Setting the parameters for the download format of youtube-dl:
ydl_opts = {
    "format": "bestaudio/best",
    "postprocessors": [{
        "key": "FFmpegExtractAudio",
        "preferredcodec": "mp3",
        "preferredquality": "192",
    }],
    "outtmpl": "./%(id)s.%(ext)s",
}
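The FFmpegExtractAudio post-processor relies on an external ffmpeg program being installed on the machine, so it can be worth checking for it before starting the download. This is a small sketch using only the standard library; the helper name ffmpeg_available is made up for this example.

```python
import shutil


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg was not found; the mp3 conversion step will likely fail.")
```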
And here is a simple function to get the video:
def get_vid(url):
    # Download the audio for the given video URL and return its metadata.
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        return ydl.extract_info(url)
Let’s go ahead and download the video as an mp3 audio file.
mp3_file = []
try:
    vid = vid.strip()  # remove stray whitespace around the link
    meta = get_vid(vid)
    mp3_file.append({"id": meta["id"] + ".mp3", "title": meta["title"]})
except Exception as err:
    print(vid, "could not be downloaded:", err)

Perfect! Our product review is ready. It should be downloaded as an audio file inside the same folder as the Jupyter notebook.
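If we want to confirm that the file really landed where we expect, a quick check like the one below can help. It is only a sketch: it assumes the "outtmpl" pattern used above (video id plus the .mp3 extension), and both helper names are hypothetical.

```python
import os


def downloaded_audio_path(video_id, folder="."):
    """Build the path the mp3 should have, given the output template above."""
    return os.path.join(folder, f"{video_id}.mp3")


def audio_is_ready(path):
    """Return True if the file exists and is not empty."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

For the video in this article, audio_is_ready(downloaded_audio_path(meta["id"])) should return True after a successful download.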
Let’s get to the next step, where real magic happens.
Step 3 — Audio Transcription with Analysis
Here comes the fun part!
We will convert the audio recording into text format. And not just that — we will also get an analysis report of the whole text. In this analysis, we will see the main points the reviewer talks about in their video.
Let’s start by uploading our audio file to AssemblyAI’s cloud. The transcription runs in the cloud, so our own computer's CPU and RAM are barely used during this conversion. Isn’t that cool?
Read audio function
def read_audio(audio_data, chunk_size=5242880):
    with open(audio_data, 'rb') as _file:
        while True:
            data = _file.read(chunk_size)
            if not data:
                break
            yield data
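To see how this generator splits a file, we can try it on a tiny temporary file with a very small chunk size. The function is repeated here so the sketch runs on its own; the 10-byte file and the 4-byte chunk size are just illustration values.

```python
import os
import tempfile


def read_audio(audio_data, chunk_size=5242880):
    # Same generator as above, repeated so this sketch is self-contained.
    with open(audio_data, "rb") as _file:
        while True:
            data = _file.read(chunk_size)
            if not data:
                break
            yield data


# Write 10 bytes to a temporary file and read it back in 4-byte chunks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")

chunk_sizes = [len(chunk) for chunk in read_audio(tmp.name, chunk_size=4)]
os.remove(tmp.name)
print(chunk_sizes)  # [4, 4, 2]
```

With the default chunk size, the mp3 file is read in pieces of about 5 MB, so the whole file never has to sit in memory at once.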
Uploading Audio to the Cloud
audio_data = mp3_file[0]['id']
headers = {
    "authorization": "API Key goes here."
}
response = requests.post('https://api.assemblyai.com/v2/upload', headers=headers, data=read_audio(audio_data))
print(response.json())
When we print out the response, we will see the URL address of the uploaded audio recording.
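Instead of copying that URL by hand, we could pull it out of the response in code. The sketch below assumes the parsed response contains an "upload_url" key; the helper name get_upload_url is hypothetical.

```python
def get_upload_url(upload_response_json):
    """Pull the uploaded file's URL out of the parsed upload response.

    Assumes the URL is stored under the "upload_url" key.
    """
    url = upload_response_json.get("upload_url")
    if not url:
        raise ValueError(f"No upload_url in response: {upload_response_json}")
    return url
```

It could be called as get_upload_url(response.json()) right after the upload request.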

Speech-to-Text with Features
Now, we will do the transcription with two features. One of them is called Auto Chapters, which divides the audio into chapters by topic and creates a summary of each one. Cheers to the power of deep learning!
The second feature I’ve turned on is IAB Categories. It returns the overall topics of the whole audio. There are 698 topics that the API is already trained to predict. This can be helpful, especially if we want to run this product review program in bulk on multiple video links.
speech_to_text_api = "https://api.assemblyai.com/v2/transcript"
data = {
    "audio_url": "The upload URL from the previous step goes here.",
    # Booleans, so they are sent as JSON true rather than the text "TRUE".
    "auto_chapters": True,
    "iab_categories": True,
}
headers = {
    "authorization": "API Key goes here.",
    "content-type": "application/json"
}
response = requests.post(speech_to_text_api, json=data, headers=headers)
print(response.json())

As the printed response shows, our speech-to-text request is queued. The id key holds our request ID. We will need it to check the request’s status and get the results.
Final Step — Checking the Results
We can check the status of the request by running the following code block:
request_id = "Request id goes here."
request_url = "https://api.assemblyai.com/v2/transcript/" + request_id
headers = {
    "authorization": "API Key goes here."
}
response = requests.get(request_url, headers=headers)
print(response.json())
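A request can stay queued or in processing for a while, so one option is to repeat the status check until the job finishes. The sketch below is one way to do that. It assumes the "status" field ends up as "completed" or "error"; wait_for_transcript, poll_seconds, and max_polls are names invented for this example.

```python
import time


def wait_for_transcript(fetch_status, poll_seconds=5, max_polls=120):
    """Call fetch_status() repeatedly until the transcript is finished.

    fetch_status is any function that returns the parsed JSON of the
    status request. The status values checked here are assumptions.
    """
    for _ in range(max_polls):
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("Transcript was not ready in time.")
```

With the variables defined above, it could be called as wait_for_transcript(lambda: requests.get(request_url, headers=headers).json()).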
When the request is completed, we will see something like this:

There are many key-value pairs in this dictionary. It can be challenging to find what we need, but we can access specific values with some basic filtering.
For example, the Auto Chapters results are stored under the chapters key.
Auto Summary Report
auto_summary_report = response.json()['chapters']
auto_summary_report

So, what are your thoughts?
Our program has created short summaries of each section discussed within the audio. Reading these couple paragraphs gives us a clear picture of the product and what is covered in the product review video.
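To make the report easier to read, the chapter list can be turned into plain text. This is a rough sketch that assumes each chapter is a dictionary with "start" and "end" times in milliseconds plus "headline" and "summary" fields; the two helper functions are hypothetical.

```python
def ms_to_timestamp(ms):
    """Convert milliseconds to an m:ss string."""
    total_seconds = int(ms) // 1000
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def format_chapters(chapters):
    """Turn a list of chapter dictionaries into a readable text report."""
    lines = []
    for number, chapter in enumerate(chapters, start=1):
        start = ms_to_timestamp(chapter.get("start", 0))
        end = ms_to_timestamp(chapter.get("end", 0))
        headline = chapter.get("headline", "")
        summary = chapter.get("summary", "")
        lines.append(f"{number}. [{start} - {end}] {headline}")
        lines.append(f"   {summary}")
    return "\n".join(lines)
```

Calling print(format_chapters(auto_summary_report)) would then list each chapter with its time range, headline, and summary.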
Before we conclude this tutorial, let’s look at the predicted topics.
Predicted Topics Report
topics_report = response.json()['iab_categories_result']
topics_report

AssemblyAI’s deep learning model has predicted some topics from just one sentence. In the output, the result with the best score is Laptops under Technology & Computing.
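When running this on several videos, we may only want the highest-scoring topics from each report. The sketch below assumes the topics report contains a "summary" dictionary that maps each topic label to a relevance score; the function name top_topics and the sample values are made up for illustration.

```python
def top_topics(iab_result, n=3):
    """Return the n highest-scoring (label, score) pairs.

    Assumes iab_result["summary"] maps topic labels to relevance scores.
    """
    summary = iab_result.get("summary", {})
    ranked = sorted(summary.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up sample data, only to show the shape this sketch expects.
sample_report = {
    "summary": {
        "Technology&Computing>Computing>Laptops": 0.98,
        "Technology&Computing>ConsumerElectronics": 0.41,
        "Hobbies&Interests>ContentProduction": 0.07,
    }
}
print(top_topics(sample_report, n=2))
```

With the real data, it could be called as top_topics(topics_report).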
Conclusion
Congrats! In this hands-on tutorial, we learned how to build a program that creates automated feedback about products from videos using Python. This will save us so much time! These kinds of projects are a great application of machine learning and artificial intelligence in our daily lives. I hope you enjoyed this project and learned something new today.
Feel free to contact me if you have any questions. Thank you!