Building a Speech Translator in Python

In this post, I will show you how to translate your speech into a different language using Python. The speech translator will record your speech, work out what you are saying, and translate it into the language you prefer. Once the translation is done, you can have your program say the translated text out loud, or you can just save it as a text document. I will show you how to do both of them. This is a great project that you will enjoy building, and you may even impress your friends from other nations 🙂

Imagine you want to communicate with someone from a different country. In the old days, you had to hire someone who could speak both languages to do all the translating. Since the rise of the internet and modern technology, communication has become much easier. Now we can use Google Translate to translate anything we want in a couple of seconds. Isn’t that mind-blowing? YouTube also does this for some videos, converting the speech in the video to different languages in real time using artificial intelligence. This is amazing and worth sharing, so I want to show how you can do something similar using Python. This way you will have your own personal translator that speaks many languages.

Let’s get started!

I would like to mention that in this project we will be practicing several topics at the same time. This is a great way to combine your skills and create something better. First, we will use a speech recognizer to capture our speech and convert it to text. Second, we will define a translator that converts that text to our preferred language. Lastly, we will use a text-to-speech engine so that our program can say the translated words out loud.

Importing libraries

First, let’s install the modules that we will be using in this program: SpeechRecognition, Pyttsx3, and Googletrans. SpeechRecognition is an open-source library that wraps several speech recognition services, including Google’s, and Googletrans is an unofficial library that talks to the Google Translate web service. Neither is published by Google itself. To record from a microphone, SpeechRecognition also needs the PyAudio package. All these modules are free to install and use, which is one of the reasons the programming world has grown so quickly. Here is the code to pip install multiple modules at the same time:

pip install speechrecognition pyttsx3 googletrans

Yes, that was it. It’s super easy to install modules using pip. If you want to learn more about these modules, each one has a documentation page that is worth reading.
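If you want to confirm that the install worked before going further, one option is to ask Python whether it can locate each package. This is only a sketch: the helper name missing_modules is my own, and it assumes the usual import names speech_recognition, pyttsx3, and googletrans (the import name for SpeechRecognition differs from the name you give pip).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the import names from `names` that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the three packages installed above.
required = ["speech_recognition", "pyttsx3", "googletrans"]

missing = missing_modules(required)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All modules are installed.")
```

If anything is reported as missing, rerun the pip command for that package and try again.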

Define the Recognizer

import speech_recognition as sr
r = sr.Recognizer()  # turns recorded audio into text

Define your Microphone

Before defining our microphone instance, we will choose our input device. There might be multiple input devices plugged into your computer, and we need to choose which one we are planning to use. As you know, machines are dumb; you have to tell them exactly what to do! Using the following code, you will be able to see your input devices.

print(sr.Microphone.list_microphone_names())
[Screenshot: list of microphone names]
Here you can see the results of me checking the input devices. I recommend running this script before you define your microphone, because you may get a different result. The script returns a list of input device names. I want to use the “Built-in Microphone”, which is the first element of the list, so its index is 0. The code that defines the microphone looks as follows:

mic = sr.Microphone(device_index=0)
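Since the order of devices can differ from one computer to another, you may prefer to look the index up by name instead of hard-coding it. Here is a small sketch of that idea. The helper find_device_index is a name I made up for illustration, and the list of names is a stand-in for whatever sr.Microphone.list_microphone_names() returns on your machine.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains `keyword`, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # An entry can be None when a device's name could not be retrieved.
        if name and keyword in name.lower():
            return index
    return None


# Stand-in list; replace it with sr.Microphone.list_microphone_names().
names = ["Built-in Microphone", "Built-in Output", "USB Headset"]
index = find_device_index(names, "built-in")
print(index)
```

The value it finds can then be passed as device_index when creating the microphone. If it returns None, no device matched, so check the printed list again.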

Recognize Speech

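We will be using the recognize_google method, which sends the recorded audio to Google’s speech recognition service and returns the text it heard. Thanks to them! Keep in mind that it needs an internet connection. It raises sr.UnknownValueError when the speech can’t be understood and sr.RequestError when the service can’t be reached.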

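with mic as source:
    # Listen to the background noise for a moment to calibrate
    r.adjust_for_ambient_noise(source)
    # Record until the speaker pauses
    audio = r.listen(source)

# Send the recording to Google's recognizer and get the text back
result = r.recognize_google(audio)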

If you want to check your result before doing the translation, you can add the following line to your code.

print(result)

Define your Translator

This is the fun part of this project. You will have a lot of options to choose from. If you want, you can do what I do: close your eyes, think of a country that you wish you could visit, and check what language they speak over there. Yes, that’s one way to choose from a bunch of different languages 🙂

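Now you can pass that language as the dest parameter of the translate method, as shown below. This code assumes a Googletrans version whose translate method returns the result directly; in some newer releases it is asynchronous and has to be awaited.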

from googletrans import Translator

p = Translator()
k = p.translate(result, dest='french')

In the code below, we pull the translated text out of the result as a string, which is what our text-to-speech module expects. The translate call in the previous code returns an object, and its text attribute holds the translation.

translated = str(k.text)
print(translated)
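If you would rather keep the translation as a text document, writing it to a file takes only a few lines. This is a minimal sketch, not the only way to do it: the function name save_translation and the default file name translation.txt are my own choices. It writes the file as UTF-8 so that accented characters such as é or ç survive.

```python
from pathlib import Path


def save_translation(text, path="translation.txt"):
    """Write `text` to `path` as UTF-8 and return the path that was written."""
    target = Path(path)
    target.write_text(text, encoding="utf-8")
    return target
```

Calling save_translation(translated) right after the translation step would create translation.txt in the folder you run the script from.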

Define Text to Speech Engine

import pyttsx3
engine = pyttsx3.init()

We’ve just defined the text-to-speech engine. Now is the time to tell our program to speak the translated text, but before that, we have to choose a voice for the target language. Here is the code to see the list of voices installed on your system, with their IDs and languages, which we will need when we are defining the speaker language. I recommend running this code in your terminal before going to the next step.

import pyttsx3

engine = pyttsx3.init()
voices = engine.getProperty('voices')
# Print the details of every voice installed on this machine
for voice in voices:
    print("Voice:")
    print(" - ID: %s" % voice.id)
    print(" - Name: %s" % voice.name)
    print(" - Languages: %s" % voice.languages)
    print(" - Gender: %s" % voice.gender)
    print(" - Age: %s" % voice.age)

Define the speaker language

Copy the ID of the voice that you want to use, and let’s paste it into our program. We are using the setProperty method to define the speaker language. The ID below is a French voice on macOS; the IDs on your system may look different.

fr_voice_id = "com.apple.speech.synthesis.voice.thomas"
engine.setProperty('voice', fr_voice_id)
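Because voice IDs differ between operating systems, a pasted ID from one machine may not exist on another. One way around that is to search the installed voices for a keyword. The sketch below assumes each voice has id and name attributes, as printed in the listing earlier; the helper pick_voice_id and the sample voices are made up for illustration.

```python
from types import SimpleNamespace


def pick_voice_id(voices, keyword):
    """Return the id of the first voice whose id or name contains `keyword`, else None."""
    keyword = keyword.lower()
    for voice in voices:
        searchable = f"{voice.id} {voice.name}".lower()
        if keyword in searchable:
            return voice.id
    return None


# Stand-in voices; with pyttsx3, pass engine.getProperty('voices') instead.
sample_voices = [
    SimpleNamespace(id="com.apple.speech.synthesis.voice.Alex", name="Alex"),
    SimpleNamespace(id="com.apple.speech.synthesis.voice.thomas", name="Thomas"),
]
print(pick_voice_id(sample_voices, "thomas"))
```

If the function returns None, no installed voice matched the keyword, and it is safer to skip the setProperty call and let the engine use its default voice.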

Final Step: See the Magic Happen

engine.say(translated)
engine.runAndWait()
