This project addresses a key limitation in speech segmentation for end-to-end speech translation systems. The baseline method, while effective, takes a "single-scale" approach: it uses only the final output layer of a pre-trained wav2vec 2.0 model. This ignores the rich phonetic and acoustic information present in earlier layers, leading to less precise boundary detection and suboptimal segment quality.
To overcome this, I developed MSHA, a novel Multi-Scale Hierarchical Attention model. Instead of a single layer, MSHA extracts features from multiple, diverse layers of the wav2vec 2.0 encoder. It then intelligently fuses these features using a hierarchical attention mechanism, which learns to dynamically weigh the importance of phonetic versus semantic information for each audio frame. Furthermore, the model is trained with a boundary-aware focal loss to improve its performance on the rare but critical boundary frames.
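To illustrate the idea behind a focal loss for rare boundary frames, here is a minimal sketch of a standard binary focal loss applied per frame. The function name, the `alpha` and `gamma` defaults, and the toy inputs are assumptions for illustration only; the exact boundary-aware formulation used in MSHA is not given in this abstract.

```python
import math


def binary_focal_loss(probs, labels, alpha=0.75, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over a sequence of frames.

    probs  : predicted boundary probabilities in [0, 1], one per frame
    labels : 1 for a boundary frame, 0 otherwise
    alpha  : weight on the boundary class (assumed value, not from the source)
    gamma  : focusing parameter; gamma=0 reduces to weighted cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of frames that are already easy
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Toy comparison: a confidently detected boundary vs. a missed one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
print(easy < hard)
```

The focusing term down-weights the many easy non-boundary frames, so the gradient is dominated by the few frames the model still gets wrong.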
Experimental results show a clear improvement over the baseline. MSHA achieves a 2.1% higher F1-score in segmentation accuracy, which translates to a 0.31-point increase in the final BLEU score for translation. It also produces more natural, sentence-like segments (5.35 s vs. 1.43 s average length) with only a minor trade-off in inference time. These results indicate that leveraging a richer, multi-scale feature set is an effective strategy for high-quality speech segmentation and directly benefits downstream speech translation tasks.
This project presents an advanced medical waste management system that integrates deep learning-based waste classification, robotic automation, autonomous navigation, and UV-C disinfection to address challenges in efficiency, safety, and sustainability in healthcare environments. The EfficientNetB0 classification model, fine-tuned on a hybrid dataset, achieved an accuracy of 81.72 percent and an AUC score of 0.8713 when categorizing waste into four types: glass, metal, plastic, and paper. A 4-DOF robotic arm powered by MG996R servo motors performed waste segregation with over 95 percent positioning accuracy and an average task completion time of 3.2 seconds per item. Autonomous navigation, supported by a multi-sonar sensor array, achieved 85 percent navigation accuracy and a collision-free rate of 97 percent, enabling efficient and dynamic movement in real-world environments. Additionally, a UV-C disinfection module with 254 nm LED strips reduced microbial contamination by 98 percent, helping to keep the workspace sanitized. This fully automated system combines waste classification, segregation, and disinfection into a single cohesive process, improving upon traditional manual methods. The design emphasizes energy efficiency, scalability, and adaptability for various healthcare settings. Future developments include expanding dataset diversity, optimizing system components, and integrating real-time monitoring capabilities. This work establishes a foundation for scalable and sustainable medical waste management, promoting safer and cleaner healthcare environments.
In this project, we designed the floor plan of an 11-story, 3-unit residential building and laid out the conduit and electrical fittings. Next, we created a switchboard connection diagram that illustrates how the incoming electricity is distributed throughout the apartment complex. The single-line diagrams show the wire schedules and safety devices such as circuit breakers, in addition to the overall connectivity. We also designed the required lightning protection system to shield the building from electrical surges caused by lightning strikes. Through this work, we gained practical experience in electrical service design for residential construction.
In recent years, the proliferation of Internet of Things (IoT) technology has revolutionized various aspects of daily life, including security systems. This abstract presents the design and implementation of an IoT-based low-cost CCTV and alarm system tailored for residential and small business use. The system integrates affordable hardware components such as Raspberry Pi single-board computers, low-cost IP cameras, motion sensors, and sirens, leveraging their compatibility with IoT protocols. By using open-source tools such as Node-RED and the MQTT protocol, the system achieves seamless communication and integration between its components. Key features of the proposed system include real-time monitoring and remote access via a user-friendly mobile application. Users receive instant notifications on their smartphones when the motion sensors detect suspicious activity or the CCTV cameras capture unauthorized access attempts. Additionally, the system supports two-way audio communication, allowing users to interact with visitors or potential intruders remotely. The low cost of the system makes it accessible to a wider range of users, including those with budget constraints. Furthermore, the modular design facilitates scalability and customization, enabling users to expand and tailor the system to their specific security needs. Overall, the system offers an affordable yet effective solution for enhancing security and surveillance in residential and small business environments. With its user-friendly interface, remote accessibility, and real-time alerts, it provides a comprehensive security solution at a fraction of the cost of traditional surveillance systems.
In this project, we explored the application of deep learning techniques to the classification of skin cancer, an increasingly vital area in medical image analysis. Our study on skin disease classification using deep learning models yielded promising results across several datasets. Through rigorous experimentation, we found that an ensemble learning approach combining EfficientNetB0 and VGG-19 achieved the highest accuracy on the ISIC 2019 dataset. On other datasets, namely HAM10000 and Dermnet, a concatenation strategy involving EfficientNetB0 and ResNet-34 performed best, approaching the accuracy levels reported in seminal studies. However, our project also revealed challenges inherent in the Dermnet dataset, primarily related to image quality. Despite preprocessing efforts, the presence of low-quality images limited the overall classification performance, which underscores the importance of data quality in deep learning-based medical image analysis. Our results have implications for clinical practice, offering potential tools for dermatologists to enhance diagnostic accuracy and improve patient outcomes. Looking ahead, our research sets the stage for refining ensemble learning methodologies and exploring new model fusion approaches to address the intricacies of skin cancer classification. Efforts to improve the generalizability of our models across diverse datasets and real-world clinical scenarios also remain imperative. Ultimately, we expect our findings to contribute to ongoing work on using deep learning for more effective and precise diagnosis and management of skin cancer and other skin diseases.
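One common way to combine two classifiers in an ensemble is soft voting, where the per-class probabilities of the models are averaged before picking the most likely class. The sketch below illustrates that idea only; the abstract does not state which combination rule was used, and the function name, weights, and toy probability vectors are hypothetical.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-class probabilities from several models.

    prob_lists : one probability vector per model, all of the same length
    weights    : optional per-model weights; defaults to equal weighting
    Returns (index of the predicted class, averaged probability vector).
    """
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists))
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return predicted, averaged


# Hypothetical outputs of two models for one image over three classes.
model_a = [0.6, 0.3, 0.1]
model_b = [0.2, 0.7, 0.1]
predicted, averaged = soft_vote([model_a, model_b])
print(predicted)
```

A concatenation strategy differs in that it joins the feature vectors of the two backbones and trains a single classifier on top, rather than averaging their final predictions.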
The goal of this project is to build an Eco-Friendly Autonomous Medical Equipment Transporter that carries various types of medical equipment from one room of a medical center to another. An autonomous medical equipment transporter is a specialized device designed to move medical supplies or equipment within a healthcare facility without human intervention. Once assigned to a healthcare facility, it transports medical equipment to the desired destination by following a user-selected path. It can therefore deliver essential items such as medicines, injections, blood, and oxygen cylinders to every room and bed in the hospital. Reducing reliance on manual transportation streamlines workflows, improves efficiency, and allows healthcare professionals to focus on caring for critical patients. The transporter will also include eco-friendly features such as biodegradable materials and a solar panel for its electricity supply.
The world is moving towards greener energy and more efficient ways to reuse wasted energy. One large source of energy wastage is the heat generated during braking. In our project, we model energy recovery from the back EMF of a BLDC motor using regenerative braking techniques. Regenerative braking is an energy recovery mechanism that slows down a moving vehicle or object by converting its kinetic energy into a form that can be either used immediately or stored until needed. During braking, the motor operates as a generator, and its back EMF drives a current that recharges the battery. In our project, this recharging is modeled by braking a BLDC motor to charge a capacitor bank. The project uses PWM techniques to control the speed of the BLDC motor and stores a portion of the energy that would otherwise be wasted during braking. The objective is to make braking systems more energy-efficient.
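The energy bookkeeping behind this setup can be sketched with two textbook formulas: the rotational kinetic energy of the rotor, E = ½Jω², and the energy gained by a capacitor bank charged from one voltage to another, E = ½C(V_end² − V_start²). The following is a rough sketch under those assumptions; the function names and all numeric values are hypothetical and are not measurements from the project.

```python
import math


def rotational_kinetic_energy(inertia_kg_m2, speed_rpm):
    """Kinetic energy (J) of a rotor with the given moment of inertia and speed."""
    omega = speed_rpm * 2.0 * math.pi / 60.0  # rpm -> rad/s
    return 0.5 * inertia_kg_m2 * omega ** 2


def capacitor_energy_gain(capacitance_f, v_start, v_end):
    """Energy (J) added to a capacitor bank charged from v_start to v_end volts."""
    return 0.5 * capacitance_f * (v_end ** 2 - v_start ** 2)


def recovery_fraction(inertia_kg_m2, rpm_before, rpm_after,
                      capacitance_f, v_start, v_end):
    """Fraction of the kinetic energy lost during braking that ends up stored."""
    lost = (rotational_kinetic_energy(inertia_kg_m2, rpm_before)
            - rotational_kinetic_energy(inertia_kg_m2, rpm_after))
    stored = capacitor_energy_gain(capacitance_f, v_start, v_end)
    return stored / lost


# Hypothetical example values, chosen only to exercise the functions.
fraction = recovery_fraction(
    inertia_kg_m2=0.002, rpm_before=3000, rpm_after=0,
    capacitance_f=1.0, v_start=0.0, v_end=10.0,
)
print(f"recovered fraction: {fraction:.2f}")
```

In a real circuit the stored fraction is further limited by winding resistance, switching losses, and the fact that back EMF falls as the motor slows, so only a portion of the braking energy is recoverable.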
Security guards falling asleep while on duty has long been a major concern for security on the BUET premises. Our project presents a machine learning-based solution to this problem: it takes an audio recording of a security guard's breathing, determines whether the person is awake or asleep, and acts accordingly. The performance of the model was evaluated on minute-long breathing recordings of different students, taken in two physical states: awake and asleep. The developed model achieved high accuracy on these recordings, which supports the viability of the approach.
The Password-Based Bank Vault Security System is an authentication mechanism designed to safeguard access to high-security bank vaults and protect valuable assets. The system employs a multi-layered approach that integrates encryption algorithms and biometric authentication to protect against unauthorized entry. It features a secure password management system that uses industry-standard encryption protocols to safeguard sensitive information. Real-time monitoring and logging of access attempts enable immediate response to suspicious activity. Different user roles with specific privileges are defined, ensuring that only authorized personnel have access to critical areas. Overall, the Password-Based Bank Vault Security System provides a comprehensive and adaptable defense against unauthorized access, aimed at meeting the stringent security requirements of modern banking institutions.
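A secure password management step typically stores a salted, slow hash of each password rather than the password itself. The sketch below shows one widely used option, PBKDF2-HMAC-SHA256 from Python's standard library; the abstract does not name the scheme actually used, so the algorithm choice, iteration count, and function names are assumptions for illustration.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Derive a salted hash of the password (assumed scheme: PBKDF2-HMAC-SHA256).

    Returns (salt, digest); both must be stored to verify the password later.
    """
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=200_000):
    """Recompute the hash and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


# Hypothetical enrolment and two access attempts.
salt, stored = hash_password("vault-passphrase")
print(verify_password("vault-passphrase", salt, stored))
print(verify_password("wrong-guess", salt, stored))
```

The constant-time comparison avoids leaking, through timing, how many leading bytes of a guess were correct; each failed verification is also a natural point to write an access-log entry.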
In this project, the transmitter circuit uses an IR LED, which emits infrared light for every electric pulse applied to it. A pulse is generated when a push button is pressed, which completes the circuit and biases the LED. Once biased, the LED emits light at a wavelength of 940 nm as a series of pulses corresponding to the push button pressed. However, there are many other sources of infrared light besides the IR LED, such as human bodies, light bulbs, and the sun, so the transmitted information can suffer interference. Modulation solves this problem. The transmitted signal is modulated onto a 38 kHz carrier: the IR LED is switched on and off at this frequency for the duration of each pulse. The information is pulse-width modulated and carried on the 38 kHz carrier. The receiver uses a TSOP1738 IR receiver, which produces an electrical output signal when infrared light is incident on it and can demodulate signals that have been modulated at 38 kHz. The detector output is filtered by a narrow-band filter that rejects frequencies below or above the carrier frequency (38 kHz in this case). The filtered output is then passed to the microcontroller, which controls the devices.
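The modulation step can be sketched in a few lines: each "on" slot of the data signal becomes a burst of 38 kHz carrier cycles, and each "off" slot leaves the LED dark. This is a simplified illustration, not the project's firmware; the 600 µs slot length, the 50% duty cycle, and the function names are assumptions.

```python
CARRIER_HZ = 38_000  # carrier frequency stated in the text


def carrier_cycles(burst_us, carrier_hz=CARRIER_HZ):
    """Number of whole carrier cycles that fit in a burst of burst_us microseconds."""
    return burst_us * carrier_hz // 1_000_000


def modulate(envelope, slot_us, carrier_hz=CARRIER_HZ):
    """Expand 0/1 envelope slots into LED on/off half-cycle samples.

    A 1 slot becomes a burst of 50% duty carrier cycles ([1, 0] per cycle);
    a 0 slot keeps the LED off for the same number of samples.
    """
    cycles = carrier_cycles(slot_us, carrier_hz)
    samples = []
    for level in envelope:
        if level:
            samples.extend([1, 0] * cycles)   # LED toggles at the carrier rate
        else:
            samples.extend([0] * (2 * cycles))  # LED stays off
    return samples


# Hypothetical 600 us slot: one burst followed by one gap.
print(carrier_cycles(600))
waveform = modulate([1, 0], slot_us=600)
print(len(waveform))
```

Because ambient infrared sources do not flicker at 38 kHz, the receiver's band-pass stage passes the bursts and rejects the background, leaving only the pulse envelope for the microcontroller to decode.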