System Overview
Minerva is a Python application that runs alongside presentation software (e.g. Zoom) and captures and analyzes the presentation video.
Its UI lets the user interact with the analysis and the presentation information.
End Result
The first picture shows the intro page, where you select the presentation area and the domain. Our focus domain was computing, but other domains such as biology and chemistry are also available.
The picture below shows the next page, which lists all of Minerva's functions. This window opens next to the window in which you are watching the presentation, so you can view the extra information while following the presentation.
Try It Yourself
Development
Backend
What does the backend do?
Take the coordinates of the presentation on the screen from the frontend.
Use PyAutoGUI to take a screenshot using the coordinates.
Use pytesseract to parse the screenshot for words.
Create a list of all the words in the screenshot.
Remove all common words from the list.
Find the root words of any words that are not in simple form.
Use Wiktionary and the domain given by the user to find the relevant definition of each word.
Send the definitions back to the frontend.
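The capture and filtering steps in the list above can be sketched roughly as follows. This is a minimal illustration, not Minerva's actual code: the function names and the tiny `COMMON_WORDS` set are hypothetical, and a real run would likely use a fuller stop-word list. `capture_text` assumes PyAutoGUI, pytesseract, and the Tesseract binary are installed.

```python
import re

# Hypothetical, deliberately small stop-word list used only for illustration.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for", "on", "with"}


def capture_text(left, top, width, height):
    """Screenshot the given screen region and return the text OCR finds in it."""
    # Imported lazily so the pure text helpers below work without these packages.
    import pyautogui
    import pytesseract

    image = pyautogui.screenshot(region=(left, top, width, height))
    return pytesseract.image_to_string(image)


def extract_words(text):
    """Split OCR output into words, keeping hyphenated terms together."""
    return re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)


def remove_common_words(words, common=COMMON_WORDS):
    """Drop common words and case-insensitive duplicates, preserving order."""
    seen = set()
    result = []
    for word in words:
        key = word.lower()
        if key in common or key in seen:
            continue
        seen.add(key)
        result.append(word)
    return result
```

Given the coordinates from the frontend, `remove_common_words(extract_words(capture_text(left, top, width, height)))` would yield the candidate words to look up.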
Tech Stack
PyAutoGUI: To take screenshots.
PyTesseract: To get words from the screenshot.
NLTK stemmer: To find root words.
Wiktionary Parser API: To find relevant definitions.
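One possible way to pick a domain-relevant definition is sketched below. It relies on the fact that Wiktionary senses often begin with a parenthesised label such as "(computing)". The helper names are hypothetical, and the shape of the data returned by `WiktionaryParser().fetch` (a list of entries, each with a `definitions` list holding `text` strings) is an assumption that should be checked against the installed version of the library.

```python
import re


def fetch_definitions(word):
    """Collect definition strings for `word` via the wiktionaryparser package."""
    # Lazy import: only needed when actually querying Wiktionary.
    from wiktionaryparser import WiktionaryParser

    texts = []
    for entry in WiktionaryParser().fetch(word):
        # Assumed structure: entry["definitions"][i]["text"] is a list of strings.
        for definition in entry.get("definitions", []):
            texts.extend(definition.get("text", []))
    return texts


def pick_definition(definitions, domain):
    """Prefer a sense labelled with `domain`; otherwise fall back gracefully."""
    label = domain.lower()
    # First pass: look only at a leading parenthesised label, e.g. "(computing) ...".
    for text in definitions:
        match = re.match(r"\s*\(([^)]*)\)", text)
        if match and label in match.group(1).lower():
            return text
    # Second pass: accept any sense that mentions the domain anywhere.
    for text in definitions:
        if label in text.lower():
            return text
    # Last resort: the first definition, or None if there are none.
    return definitions[0] if definitions else None
```

For example, `pick_definition(fetch_definitions("stack"), "computing")` would prefer a sense labelled as computing over the general-purpose one.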
Challenges
NLTK stemmer does not always give the correct root word.
E.g. it returns "troubl" as the root word of "troubling".
Sometimes the parsed strings are proper nouns (such as names).
Different screen resolutions may cause issues.
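One way the stemming issue could be softened is to accept a stem only when it is a known word, and otherwise try a simple repair or fall back to the original word. The sketch below is an illustration under assumptions, not the approach Minerva uses: `choose_root` and the `vocabulary` set are hypothetical, and the vocabulary would have to come from some word list of your choosing.

```python
def porter_stem(word):
    """Stem a word with NLTK's Porter stemmer."""
    # Lazy import so choose_root can be used with another stemmer without NLTK.
    from nltk.stem import PorterStemmer

    return PorterStemmer().stem(word)


def choose_root(word, vocabulary, stem=porter_stem):
    """Return a root for `word` that is a real word, or the word itself.

    `vocabulary` is a hypothetical set of known lowercase words.
    """
    lowered = word.lower()
    candidate = stem(lowered)
    # Stemmers often drop a trailing "e" (troubling -> troubl), so try restoring it.
    for option in (candidate, candidate + "e"):
        if option in vocabulary:
            return option
    # No valid root found: keep the original word rather than a non-word stem.
    return lowered
```

With a vocabulary that contains "trouble", a stem of "troubl" would be repaired to "trouble" instead of being looked up as a non-word.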
Frontend
Tech Stack
Python Flask/Jinja2 (Server-Side)
JavaScript (Client-Side)
HTML/CSS
Bootstrap Framework
Challenges
Initial Highlighting System
Obtaining and transmitting coordinates
Integration of backend Python scripts
Updating user interface with parsed words and definitions
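The coordinate hand-off between the client and the Flask server might look something like the sketch below. The `/region` route, the JSON field names, and both function names are assumptions made for illustration; the original frontend may structure this differently.

```python
def parse_region(payload):
    """Validate a JSON payload into (left, top, width, height) integers, or None."""
    try:
        region = tuple(
            int(round(float(payload[key])))
            for key in ("left", "top", "width", "height")
        )
    except (KeyError, TypeError, ValueError, OverflowError):
        return None
    left, top, width, height = region
    # Reject regions that are empty or start off-screen.
    if width <= 0 or height <= 0 or left < 0 or top < 0:
        return None
    return region


def create_app():
    """Build a small Flask app exposing a hypothetical /region endpoint."""
    # Imported lazily so parse_region can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/region", methods=["POST"])
    def set_region():
        region = parse_region(request.get_json(silent=True) or {})
        if region is None:
            return jsonify(error="invalid region"), 400
        # A real handler would pass `region` on to the screenshot step here.
        return jsonify(region=region)

    return app
```

On the client side, the JavaScript would POST the selected area as JSON (for example with `fetch`) and update the page with whatever words and definitions the server sends back.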