Team: Sefa Alp and Muhammed Atakan Pehlivanoğlu
Year: Spring 2020
Description: We occasionally hold important meetings with our managers, teachers, or project partners, whether to solve specific problems or to draw a new roadmap for the project we are working on. However, important details of these meetings are forgotten within a few days, and progress can stall because of the missing information. In this project we focused on this problem and tried to find a method that is easier and more effective than taking notes; Earwig emerged from this effort. In its simplest form, Earwig first records each speaker's voice separately to create training data, and the program uses these recordings to train a model for each speaker. A dialogue or meeting recording is then given to the program, and the output is the set of sentences spoken by each participant, as separate voice recordings, in the order in which they were spoken.
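The described pipeline (per-speaker enrollment followed by attribution of meeting segments) resembles a simple speaker-identification scheme. Below is a minimal sketch along those lines, assuming MFCC features and one Gaussian mixture model per speaker; the file names, window length, and model sizes are illustrative assumptions, not details taken from the project report:

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(path, sr=16000):
    """Load audio and return a (frames, 13) matrix of MFCC features."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T

# Enrollment: train one GMM per speaker from their separate voice recordings.
# File names are placeholders for the per-speaker training records.
speakers = {"alice": "alice_enroll.wav", "bob": "bob_enroll.wav"}
models = {name: GaussianMixture(n_components=8, covariance_type="diag")
                .fit(mfcc_features(path))
          for name, path in speakers.items()}

# Attribution: split the meeting into short windows and assign each window
# to the speaker whose model gives the highest average log-likelihood.
meeting = mfcc_features("meeting.wav")
window = 100  # roughly 1 s of frames at the default hop length
for start in range(0, len(meeting) - window, window):
    chunk = meeting[start:start + window]
    best = max(models, key=lambda name: models[name].score(chunk))
    print(f"frames {start}-{start + window}: {best}")
```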
More Information: Project report
Team: Göktuğ Yıldırım
Year: Spring 2020
Description: The goal of this project is to develop a GUI (Graphical User Interface) that can analyze given speech signals using digital signal processing techniques. Signal processing is one of the most important topics in systems and electronics engineering. In general, the analysis of analog and digital signals can be described as detecting temporal and spatial changes and applying the results to various systems; speech processing is one of the subfields of signal processing. Although MATLAB is one of the most popular signal analysis tools available today and provides a detailed environment for general signal processing tasks, it does not offer a user-friendly interface for sound analysis because it is not customized for speech processing. On the other hand, WaveSurfer is a widely used audio editor for acoustic-phonetic studies, but according to my observations it does not provide a sufficiently user-friendly interface for basic speech analysis tasks: its large number of settings can be overwhelming to a novice user. In this project, loudness, pitch, and formant frequencies were used as the acoustic features of the tool. With this tool, novice users will be able to analyze speech signals easily by observing the waveform, Fast Fourier Transform (FFT), short-time log energy, short-time zero-crossing rate, Short-Time Fourier Transform, short-time real cepstrum, spectrogram, formant frequencies, and pitch contour. As a result, the developed tool provides an efficient speech analysis environment for novice users, especially undergraduate students in electrical and electronics engineering.
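As a rough illustration of two of the listed short-time features, the sketch below computes short-time log energy and the short-time zero-crossing count over fixed-length frames in Python. The frame length, hop size, and floor constant are illustrative assumptions; the project's GUI and its exact parameter choices are not reproduced here:

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Slice a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])

def short_time_log_energy(x, frame_len=400, hop=160, eps=1e-10):
    """Log of the summed squared samples in each frame."""
    frames = frame_signal(x, frame_len, hop)
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def short_time_zero_crossings(x, frame_len=400, hop=160):
    """Number of sign changes per frame; tends to be high for unvoiced sounds."""
    frames = frame_signal(x, frame_len, hop)
    return np.sum(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

# Example with a synthetic signal: 1 s of silence followed by a 200 Hz tone.
sr = 16000
x = np.concatenate([np.zeros(sr), np.sin(2 * np.pi * 200 * np.arange(sr) / sr)])
print(short_time_log_energy(x)[:3], short_time_zero_crossings(x)[-3:])
```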
More Information: Project report
Team: Göktuğ Kayacan, Remzi Orak and Sefa Alp
Year: Spring 2019
Description: This project’s main goal is to implement a phoneme recognition system that uses Mel-Frequency Cepstral Coefficients (MFCCs) to distinguish between different phonemes with a parametric classification model such as a feedforward neural network or a recurrent neural network. The system should be able to produce an output for a given set of MFCC features of a frame, where the frame size matches that of the frames used to train the model. The result from the network should be a list of label probabilities for the given features.
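A feedforward classifier of this kind can be sketched as a small multilayer perceptron that maps one frame of MFCC features to a probability distribution over phoneme labels. The feature dimension, hidden size, and number of phoneme classes below are illustrative assumptions rather than values from the report:

```python
import torch
import torch.nn as nn

N_MFCC = 13        # MFCC coefficients per frame (assumed)
N_PHONEMES = 40    # size of the phoneme label set (assumed)

class PhonemeMLP(nn.Module):
    """Feedforward network: one MFCC frame in, phoneme label probabilities out."""
    def __init__(self, n_in=N_MFCC, n_hidden=64, n_out=N_PHONEMES):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_out),
        )

    def forward(self, x):
        # Softmax turns the raw scores into a list of label probabilities.
        return torch.softmax(self.net(x), dim=-1)

model = PhonemeMLP()
frame = torch.randn(1, N_MFCC)   # one frame of MFCC features
probs = model(frame)             # shape (1, N_PHONEMES), rows sum to 1
print(probs.argmax(dim=-1))      # index of the most likely phoneme label
```

During training the softmax would usually be folded into the loss (for example, nn.CrossEntropyLoss applied to the raw scores) rather than applied in the forward pass.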
More Information: Project report
Team: Sena Koyuncu, Mustafa Can Gülbaş and Mehmet Taylan Eğer
Year: Spring 2019
Description: In multimedia applications, the discrimination of silence, music, and speech plays an important role. This project discriminates the regions that contain silence, music, and speech in a given audio file. The project is implemented in MATLAB using temporal features such as short-term energy and log energy, which are simpler to obtain than spectral features. First, silent regions are detected using an end-point detection algorithm. Then, examining the log energy, it can be seen that the signal stays above a certain threshold in regions containing music, whereas regions with speech do not show such regularity due to breathing pauses. Using this observation, the speech and music regions are discriminated as well, and the region classifications are shown on a graph.
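The classification rule described above can be sketched as follows: frames whose log energy falls below a silence threshold are labelled silence, and the remaining regions are labelled music when nearly all of their frames stay above the threshold (regular energy) and speech otherwise (frequent dips from breathing pauses). The thresholds, frame sizes, and block length below are illustrative assumptions; the project itself is implemented in MATLAB, so this Python version only mirrors the idea:

```python
import numpy as np

def log_energy(x, frame_len=400, hop=160, eps=1e-10):
    """Short-time log energy of a 1-D signal."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def classify_regions(x, silence_thresh=-5.0, block=200, music_ratio=0.95):
    """Label blocks of ~2 s (at a 10 ms hop) as silence, music, or speech."""
    e = log_energy(x)
    labels = []
    for start in range(0, len(e), block):
        above = np.mean(e[start:start + block] > silence_thresh)
        if above < 0.1:
            labels.append("silence")   # almost no frames carry energy
        elif above > music_ratio:
            labels.append("music")     # energy stays above the threshold
        else:
            labels.append("speech")    # dips caused by breathing pauses
    return labels

# Example: 2 s of silence followed by 2 s of a steady tone at 16 kHz.
sr = 16000
x = np.concatenate([np.zeros(2 * sr),
                    np.sin(2 * np.pi * 440 * np.arange(2 * sr) / sr)])
print(classify_regions(x))   # expected: ['silence', 'music']
```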
More Information: Project report