This project is a text-to-speech application that converts the contents of a PDF file into an audio file. It also provides built-in media controls, so users can play the audio without opening it in a separate program.
The application uses the pdfplumber Python package to extract text from the PDF, which is then saved to a temporary .txt file. That text is processed using Google's Text-to-Speech API, which generates the audio output.
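The extraction step described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names `join_pages`, `extract_pdf_text`, and `save_to_temp_txt` are hypothetical, and the sketch assumes that pages without extractable text (for example, scanned images) should simply be skipped.

```python
import tempfile
from pathlib import Path


def join_pages(page_texts):
    """Join per-page text, skipping pages that yielded no text."""
    return "\n".join(text.strip() for text in page_texts if text and text.strip())


def extract_pdf_text(pdf_path):
    """Read every page of the PDF with pdfplumber and return one string."""
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can give None or "" for image-only pages
        return join_pages(page.extract_text() for page in pdf.pages)


def save_to_temp_txt(text):
    """Write the text to a temporary .txt file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
    return Path(handle.name)
```

Calling `save_to_temp_txt(extract_pdf_text("book.pdf"))` would give the path of the text file to hand to the synthesis step. If that step uses the `gTTS` package (an assumption; the description only says Google's Text-to-Speech API), the text could then be passed to `gTTS(text=...).save("output.mp3")`.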
To keep the codebase organized and maintain a clear separation of concerns, I used object-oriented design principles. The program is structured into two classes: one for the speech synthesiser and one for the graphical user interface (GUI). The GUI class interacts with the synthesiser to perform the conversion and control playback.
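One way to picture this split is the sketch below. It is illustrative only: the class names `SpeechSynthesiser` and `PlayerGUI`, the injected `extract` and `synthesise` callables, and the fixed `output.mp3` filename are all assumptions, and playback controls are left out because the audio library is not named here.

```python
class SpeechSynthesiser:
    """Owns the conversion; knows nothing about widgets."""

    def __init__(self, extract, synthesise):
        # Both callables are injected so the backends can be swapped or faked.
        self._extract = extract        # pdf_path -> text
        self._synthesise = synthesise  # (text, audio_path) -> None
        self.audio_path = None

    def convert(self, pdf_path, audio_path):
        text = self._extract(pdf_path)
        if not text.strip():
            raise ValueError("no extractable text in PDF")
        self._synthesise(text, audio_path)
        self.audio_path = audio_path
        return audio_path


class PlayerGUI:
    """Owns the widgets; delegates all conversion work to the synthesiser."""

    def __init__(self, synthesiser):
        # Imported here so the synthesiser can be used without a display.
        import tkinter as tk
        from tkinter import filedialog

        self._synth = synthesiser
        self._filedialog = filedialog
        self.root = tk.Tk()
        self.root.title("PDF to Speech")
        self.status = tk.StringVar(value="Choose a PDF")
        tk.Label(self.root, textvariable=self.status).pack(padx=12, pady=6)
        tk.Button(self.root, text="Convert PDF", command=self.on_convert).pack(pady=6)

    def on_convert(self):
        pdf_path = self._filedialog.askopenfilename(filetypes=[("PDF files", "*.pdf")])
        if not pdf_path:
            return  # dialog was cancelled
        try:
            self._synth.convert(pdf_path, "output.mp3")
            self.status.set("Saved output.mp3")
        except ValueError as error:
            self.status.set(str(error))

    def run(self):
        self.root.mainloop()
```

Because the GUI only calls `convert`, the synthesiser can be exercised on its own with stand-in callables, and the interface could be replaced without touching the conversion logic.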
The interface was built using Tkinter, which I found to be a suitable choice for a lightweight, standalone application.
This project was especially rewarding because it allowed me to: