Text detection and recognition (Python)

This project aims to extend the capabilities of object detection by integrating text recognition.

Prerequisites:

To detect and recognise text in the video feed, the program uses pytesseract, a Python wrapper for the pre-trained Tesseract OCR engine. The path to the Tesseract installation needs to be defined in the program.
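As a rough sketch of the path setup: pytesseract reads the location of the Tesseract executable from `pytesseract.pytesseract.tesseract_cmd`. The helper names `find_tesseract` and `configure_pytesseract` are hypothetical and not taken from the project, and the sketch assumes Tesseract itself has been installed separately.

```python
import shutil


def find_tesseract(explicit_path=None):
    """Return the path to the tesseract executable, or None if not found."""
    if explicit_path:
        # Trust a path supplied by the user (e.g. a custom install folder).
        return explicit_path
    # Otherwise look for "tesseract" on the system PATH.
    return shutil.which("tesseract")


def configure_pytesseract(explicit_path=None):
    """Point pytesseract at the executable; return the path used, or None."""
    path = find_tesseract(explicit_path)
    if path is None:
        return None  # Tesseract is not installed or not on PATH
    try:
        import pytesseract  # imported lazily so the lookup works without it
    except ImportError:
        return None  # the pytesseract package itself is missing
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling `configure_pytesseract()` once at start-up, before any recognition call, is enough; a `None` result signals that either the engine or the wrapper is missing.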

To convert the recognised text to audio, a text-to-speech engine is also used.
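The post does not name the text-to-speech engine. The sketch below assumes gTTS (a third-party package that needs an internet connection) purely for illustration; the function names are hypothetical, and the backend is passed in as a parameter so another engine can be swapped in. Playback of the mp3 is left out because the player used is not stated.

```python
from pathlib import Path


def gtts_synthesise(text, out_path):
    """Assumed backend: gTTS. Any callable with this signature would do."""
    from gtts import gTTS  # third-party; imported only when actually used
    gTTS(text=text, lang="en").save(str(out_path))


def text_to_mp3(text, out_path, synthesise=gtts_synthesise):
    """Write `text` as speech to `out_path`; return the path, or None if skipped."""
    cleaned = " ".join(text.split())  # collapse OCR line breaks and stray spaces
    if not cleaned:
        return None  # nothing was recognised, so there is nothing to read out
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    synthesise(cleaned, out_path)
    return out_path
```

Skipping empty strings matters here because OCR on a poor crop often returns only whitespace, which would otherwise produce a silent or failing audio file.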

How does it work?

The program uses threading to run two tasks at the same time. One thread processes the video feed, extracting frames and cropping any text detected in them. The other thread performs the text recognition, then generates and plays the resulting mp3 file.
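The two-thread arrangement can be sketched with the standard library alone. This is a minimal illustration, not the project's code: it assumes the threads hand work over through a `queue.Queue`, and `detect_crops`, `recognise` and `speak` are hypothetical callables standing in for the detection, OCR and audio steps.

```python
import queue
import threading

STOP = object()  # sentinel telling the second thread that no more crops will come


def producer(frames, detect_crops, crops):
    """Thread 1: take frames, detect text regions, queue the cropped images."""
    for frame in frames:
        for crop in detect_crops(frame):
            crops.put(crop)
    crops.put(STOP)


def consumer(crops, recognise, speak):
    """Thread 2: recognise the text in each crop and read it out."""
    while True:
        crop = crops.get()  # blocks until the first thread provides a crop
        if crop is STOP:
            break
        text = recognise(crop)
        if text.strip():  # ignore crops where nothing was recognised
            speak(text)


def run_pipeline(frames, detect_crops, recognise, speak):
    """Start both threads and wait until all frames have been handled."""
    crops = queue.Queue()
    t1 = threading.Thread(target=producer, args=(frames, detect_crops, crops))
    t2 = threading.Thread(target=consumer, args=(crops, recognise, speak))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
```

With stand-in functions, `run_pipeline(["stop", "exit"], lambda f: [f], str.upper, print)` prints each recognised string in order. The queue keeps the video thread from waiting on slow recognition or audio playback.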

Detected text will be cropped from the frame and saved in a separate folder. Then the pytesseract module will recognise and return the text in the image. The text will then be saved as an mp3 file and read out to the user.
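The crop, save and recognise steps might look like the sketch below. It assumes the frame is an OpenCV image (a NumPy array) and that the detector, which the post does not name, returns boxes as `(x, y, w, h)`. The helper names, the padding value and the file naming scheme are all hypothetical.

```python
from pathlib import Path


def clamp_box(box, frame_width, frame_height, pad=4):
    """Pad an (x, y, w, h) box and clip it to the frame; return (x0, y0, x1, y1)."""
    x, y, w, h = box
    x0 = max(0, x - pad)
    y0 = max(0, y - pad)
    x1 = min(frame_width, x + w + pad)
    y1 = min(frame_height, y + h + pad)
    return x0, y0, x1, y1


def crop_path(folder, frame_index, box_index):
    """Build a predictable file name for a saved crop (hypothetical scheme)."""
    return Path(folder) / f"frame{frame_index:05d}_box{box_index:02d}.png"


def crop_save_and_read(frame, box, out_file):
    """Crop a detected box from an OpenCV frame, save it, and OCR the crop."""
    import cv2          # third-party: opencv-python
    import pytesseract  # third-party wrapper around the Tesseract engine
    height, width = frame.shape[:2]
    x0, y0, x1, y1 = clamp_box(box, width, height)
    crop = frame[y0:y1, x0:x1]  # NumPy slicing: rows (y) first, then columns (x)
    Path(out_file).parent.mkdir(parents=True, exist_ok=True)
    cv2.imwrite(str(out_file), crop)  # keep a copy in the crops folder
    rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR
    return pytesseract.image_to_string(rgb).strip()
```

A few pixels of padding around the box are included because tightly cropped text tends to be recognised less reliably; the exact amount is a guess and would need tuning.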

Limitations:

Text detection and recognition are not always accurate. Some text may not be recognised at all; in other cases the presence of text is detected but its content is not deciphered correctly. Consequently, the text read out is not always accurate, as can be seen in the video.

Additionally, sentences are sometimes treated as individual words. As a result, the text is usually read out one word at a time, which is not very practical.
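One possible mitigation, which is not part of the program described here, is to merge word boxes that sit on the same line before sending them to the speech engine. The sketch below assumes each detected word comes with its position as `(text, x, y, height)`; the function name and the tolerance value are hypothetical.

```python
def group_into_lines(words, tolerance=0.6):
    """Group (text, x, y, height) word boxes into lines, read left to right.

    Two words share a line when their vertical centres differ by no more
    than `tolerance` times the word height.
    """
    lines = []  # each entry: [centre_y of the line's first word, [(x, text), ...]]
    for text, x, y, height in sorted(words, key=lambda w: w[2] + w[3] / 2):
        centre = y + height / 2
        if lines and abs(centre - lines[-1][0]) <= tolerance * height:
            lines[-1][1].append((x, text))
        else:
            lines.append([centre, [(x, text)]])
    # Within each line, order the words by their x position.
    return [" ".join(t for _, t in sorted(members)) for _, members in lines]
```

For example, the boxes `("ONLY", 120, 52, 20)`, `("EXIT", 10, 50, 20)` and `("PUSH", 10, 100, 20)` are grouped into the two lines `"EXIT ONLY"` and `"PUSH"`, which can then each be read out as one phrase.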
