Created using Python with the modules PyQt5, hashlib (for SHA-256 and MD5), and csv.
Outcome: Fully functional image hashing and detection tool with GUI, database, and documentation.
Hashing using SHA-256 and MD5.
Bulk and single-image import.
Preview images before adding.
Fully editable CSV database (internal GUI + external editing).
Duplicate detection.
Image lookup and match detection.
Clear GUI built with PyQt5.
Integrated help system.
A brief architecture explanation:
Python backend using hashlib for hashing.
PyQt5 for GUI and user workflow.
CSV-based hash database for transparency and compatibility.
Modular codebase (hashing module, database handler, GUI controller).
Secure design considerations to keep the tool safe and ethical.
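The hashing module described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `hash_image` and the chunk size are assumptions; only the use of hashlib for SHA-256 and MD5 comes from the description.

```python
import hashlib


def hash_image(path, chunk_size=65536):
    """Return (sha256_hex, md5_hex) for the file at `path`.

    `hash_image` and `chunk_size` are hypothetical names for this sketch.
    """
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return sha256.hexdigest(), md5.hexdigest()
```

Both digests are computed in a single pass over the file, so each image is read from disk only once.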
The program performed well across all testing phases. Its accuracy met the needs of the project: the tool consistently produced 100% matching results for the same images, without variation. Both MD5 and SHA-256 gave consistent output, and no collisions appeared in my results. One concern for the project was false positives and false negatives; however, the testing phase confirmed zero false positives and zero false negatives across 46 images (23 images hashed and verified three times).
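A repeatability and collision check of the kind described above might look like the sketch below. The helper names (`digests`, `unstable_images`, `colliding_pairs`) and the in-memory `samples` dictionary are assumptions made for illustration; the actual test procedure may have differed.

```python
import hashlib
from itertools import combinations


def digests(data):
    """Return (sha256_hex, md5_hex) for a bytes object."""
    return hashlib.sha256(data).hexdigest(), hashlib.md5(data).hexdigest()


def unstable_images(samples, runs=3):
    """Names whose digests changed between repeated runs (expected: none)."""
    return [name for name, data in samples.items()
            if len({digests(data) for _ in range(runs)}) != 1]


def colliding_pairs(samples):
    """Pairs with different bytes but an identical SHA-256 or MD5 digest."""
    pairs = []
    for (n1, d1), (n2, d2) in combinations(samples.items(), 2):
        if d1 != d2:
            s1, m1 = digests(d1)
            s2, m2 = digests(d2)
            if s1 == s2 or m1 == m2:
                pairs.append((n1, n2))
    return pairs
```

Here `samples` maps an image name to its raw bytes. An empty list from both functions corresponds to the "no variation, no collisions" outcome reported in the testing phase.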
Added this drop-down for those who want to know more about my project (didn't want the page too cluttered).
The program was designed with a modular architecture, which promotes scalability and maintainability. It runs through a graphical user interface (GUI) built in PyQt5 and uses several components to handle tasks within the system.
Python 3.x: Python 3 serves as the core programming language, offering a high-level language that keeps the code accessible and easy to modify where necessary.
PyQt5: Used to create the GUI, enabling user interaction and a responsive multi-tabbed application.
hashlib: The backbone of the project and its main library. hashlib calculates the hashes of the input images; from the library, we specifically used its SHA-256 and MD5 hash functions.
csv: Used to implement a lightweight, portable, and readable database. In the project, the CSV file stores our hashes (SHA-256 and MD5), which promotes accessibility and portability and keeps the database easy to scale.
shutil: Used to safely copy selected images into the program's known_images directory when a new image is added to the database.
os: Used to check our directories and, more generally, to access the operating system and navigate the file system.
sys: Used to close the program cleanly, which helps mitigate resource drain and ensures the databases are handled properly on exit.
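To show how hashlib, csv, shutil, and os could fit together in the database handler, here is a minimal sketch. The column layout (`filename`, `sha256`, `md5`) and the function names `file_hashes`, `load_db`, `lookup`, and `add_image` are assumptions for illustration; only the known_images directory name and the libraries themselves come from the description above.

```python
import csv
import hashlib
import os
import shutil

FIELDS = ["filename", "sha256", "md5"]  # assumed column layout


def file_hashes(path):
    """Return (sha256_hex, md5_hex) for the file at `path`."""
    with open(path, "rb") as f:
        data = f.read()
    return hashlib.sha256(data).hexdigest(), hashlib.md5(data).hexdigest()


def load_db(csv_path):
    """Read every row of the CSV database (empty list if it does not exist)."""
    if not os.path.exists(csv_path):
        return []
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def lookup(csv_path, image_path):
    """Return the matching database row for an image, or None."""
    sha256, md5 = file_hashes(image_path)
    for row in load_db(csv_path):
        if row["sha256"] == sha256 and row["md5"] == md5:
            return row
    return None


def add_image(csv_path, image_path, known_dir="known_images"):
    """Copy the image into known_dir and record its hashes.

    Returns False without changing anything if the image is a duplicate.
    """
    if lookup(csv_path, image_path) is not None:
        return False
    os.makedirs(known_dir, exist_ok=True)
    dest = shutil.copy2(image_path, known_dir)  # copies, never moves
    sha256, md5 = file_hashes(image_path)
    is_new = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"filename": os.path.basename(dest),
                         "sha256": sha256, "md5": md5})
    return True
```

Because the database is a plain CSV file with a header row, it stays readable and editable in any spreadsheet or text editor, which matches the transparency goal stated earlier.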