In this post I will show you a cool Python program that extracts text from an image, translates it into a language of your choice, and saves the result as a text document.
Make your own OCR (Optical Character Recognition) system by using Python.
Programming is fun, so let’s get started!
First and foremost, let’s add the libraries we will need.
# adds image processing capabilities
from PIL import Image
# will convert the image to a text string
import pytesseract
# translates the text into the chosen language
from googletrans import Translator
Yes, that’s all you need, just three libraries. If you haven’t installed them yet, pip works for all three: pip install pillow pytesseract googletrans. Keep in mind that pytesseract is only a wrapper, so the Tesseract OCR engine itself also has to be installed on your computer. Let me know in the comments if you have any issues.
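If you are not sure whether everything is in place, a small check like the one below can help. This is only a sketch: the helper names are my own, and it assumes the Tesseract engine (the program that pytesseract calls behind the scenes) is installed as an executable named tesseract on your PATH.

```python
import importlib.util
import shutil

# Import names of the three libraries used in this post.
# Note that the Pillow package is imported as "PIL".
REQUIRED_MODULES = ["PIL", "pytesseract", "googletrans"]


def missing_modules(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def tesseract_on_path():
    """Return True if an executable called `tesseract` is on the PATH (assumed name)."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    print("missing Python packages:", missing_modules(REQUIRED_MODULES))
    print("tesseract engine found:", tesseract_on_path())
```

An empty list and True mean you are probably good to go.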
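Secondly, let’s download an example image from the internet so that we can test our code. I downloaded a short poem by Fitzgerald. Let’s give our code a literary touch 🙂
Here is the link to the picture I downloaded to test the code: Link
Now, we will open the image file, assign it to a variable, and print the variable. Printing it shows a short description of the image (such as its mode and size) rather than the picture itself, which is a quick way to confirm that the file was loaded.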
# opening an image from the source path
img = Image.open('image_text/images/book1.jpg')
print(img)
Third step, let’s extract the text from the image, thanks to the pytesseract library.
# converts the image to text and saves it into the result variable
result = pytesseract.image_to_string(img)
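Raw OCR output is often a bit messy: extra blank lines, trailing spaces, and sometimes a form feed character at the end of a page. Here is a minimal cleanup sketch you could run on the extracted text before translating it. The function name clean_ocr_text is my own, and what counts as "clean" is an assumption you may want to adjust for your images.

```python
def clean_ocr_text(text):
    """Tidy raw OCR output: drop form feeds, trailing spaces and repeated blank lines."""
    # Tesseract may mark the end of a page with a form feed character.
    text = text.replace("\x0c", "")
    cleaned = []
    for line in text.splitlines():
        line = line.rstrip()
        # Keep the line if it has text, or if it is the first blank line in a row.
        if line or (cleaned and cleaned[-1]):
            cleaned.append(line)
    return "\n".join(cleaned).strip()
```

With the variable from the step above, that would be result = clean_ocr_text(result).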
Fourth step, let’s translate the result into any language we want. In my code, I translated it into French. Our translator comes from the googletrans library, which sends the text to Google Translate, so this step needs an internet connection.
p = Translator()
# translates the text into French
k = p.translate(result, dest='french')
# converts the result into string format
translated = str(k.text)
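A short poem is no problem, but for longer texts such as a whole book page it can be safer to translate in smaller pieces. The sketch below splits the text at line breaks and translates each piece with whatever function you pass in. The names split_into_chunks and translate_long_text are hypothetical, and the 4000 character limit is just a number I picked for illustration, not a documented limit of any service.

```python
def split_into_chunks(text, max_chars=4000):
    """Split text into pieces of at most max_chars, breaking only at line ends.

    A single line longer than max_chars is kept whole instead of being cut.
    """
    chunks = []
    current = []  # lines collected for the chunk being built
    size = 0      # length of those lines once joined with "\n"
    for line in text.splitlines():
        extra = len(line) + (1 if current else 0)
        if current and size + extra > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
            extra = len(line)
        current.append(line)
        size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks


def translate_long_text(text, translate_one, max_chars=4000):
    """Translate text piece by piece with the given function and rejoin the pieces."""
    pieces = split_into_chunks(text, max_chars)
    return "\n".join(translate_one(piece) for piece in pieces)
```

With the translator from this step, translate_one could be something like lambda chunk: p.translate(chunk, dest='french').text.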
Fifth and final step for this tutorial, we will save our results in a text document. The following code creates a document called “test1.txt”, writes the extracted text and the translated text into it, and then prints “ready!” in your terminal.
# utf-8 keeps accented characters in the translation intact
with open('test1.txt', mode='w', encoding='utf-8') as file:
    file.write(result)
    file.write("\n")
    file.write(translated)
print("ready!")
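If you process more than one picture, writing everything to the same file will overwrite your earlier results. One possible variation, sketched below, is to name the text file after the image. The helper names output_path_for and save_results are my own, and saving next to the image is just one design choice.

```python
from pathlib import Path


def output_path_for(image_path):
    """Return the image path with its extension replaced by .txt."""
    return Path(image_path).with_suffix(".txt")


def save_results(image_path, original, translated):
    """Write the original and translated text to a .txt file next to the image."""
    out_path = output_path_for(image_path)
    out_path.write_text(original + "\n" + translated, encoding="utf-8")
    return out_path
```

For the example image, save_results('image_text/images/book1.jpg', result, translated) would write to book1.txt in the same folder.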
Yes, that’s all!
I hope you enjoyed creating your own OCR program and that you will use it in your daily life.
Personally, I am using the one I created to convert my book pages into text format. It helps me take notes more easily and quickly. Instead of writing down my favorite lines, now I just take a picture and convert it into text using the program I’ve created. That’s the cool part of coding: creating something that solves a problem in your life.
In my next post, I will show how to convert text to speech, which is also super cool. Please follow my blog to learn more about Python and deep learning.
Thank you,