Domains of interest:
Signal and image processing, Artificial intelligence, Handwriting recognition, Automatic speech transcription, Data fusion, Data mining, Big data, Decision functions, ...
Current research work (Postdoctoral work):
Under construction ...
Past research work (initiated when I started my PhD thesis, 2010-2015):
During my PhD at the Ecole polytechnique of the University of Nantes, I worked on the recognition of graphical languages in a bimodal context. Now, as an assistant professor at the IUT of Nantes, I still carry out my research within the "Institut de Recherche en Communications et Cybernétique de Nantes" (IRCCyN) laboratory. I am a member of the Images and Video Communications (IVC) team, where I did my PhD.
My research work concerns the use of the bimodal nature of information to automatically recognize two-dimensional languages (such as electrical circuits, charts, house plans, mathematical expressions…). These languages are of great interest in the context of human-machine interaction. Any progress in the way these languages are entered into a computer will have a significant practical impact, since many people use them.
The main goal of my research is to set up a bimodal system based on speech signals (recorded with a microphone) and online handwriting signals (acquired from a tablet PC, a digital pen and so on). The information conveyed by these two signals is merged using data fusion techniques. In this way, we aim to overcome the intrinsic weaknesses of each modality.
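As an illustration of the merging step described above, here is a minimal sketch of one common data fusion scheme, a weighted sum of the scores produced by each modality. This page does not say which fusion technique the actual system uses, so the weighted-sum rule, the function names `fuse_scores` and `best_label`, the `speech_weight` value and the toy scores below are all assumptions made for the example.

```python
def fuse_scores(handwriting_scores, speech_scores, speech_weight=0.3):
    """Weighted-sum late fusion of two {label: score} dictionaries.

    Hypothetical helper: the fusion rule and the default weight are
    assumptions for illustration, not the method of the actual system.
    """
    labels = set(handwriting_scores) | set(speech_scores)
    fused = {
        label: (1.0 - speech_weight) * handwriting_scores.get(label, 0.0)
        + speech_weight * speech_scores.get(label, 0.0)
        for label in labels
    }
    # Renormalise so that the fused scores sum to 1.
    total = sum(fused.values())
    if total > 0:
        fused = {label: score / total for label, score in fused.items()}
    return fused


def best_label(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


# Toy scores (invented): the handwritten stroke looks like either the
# letter "x" or a multiplication sign, while the speech is unambiguous.
handwriting = {"x": 0.48, "times": 0.52}
speech = {"x": 0.90, "times": 0.10}

fused = fuse_scores(handwriting, speech)
print(best_label(handwriting), best_label(fused))
```

In this toy case the handwriting scores alone favour the multiplication sign, while the fused scores favour the letter "x", which is the kind of ambiguity a second modality is meant to resolve.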
The results obtained so far support the hypothesis that the two modalities are complementary, and they suggest many promising directions for future work to further improve the current system (cf. publications list).
PhD thesis abstract: Significant efforts are being made to make the way humans interact with their machines as natural as possible. In this quest, much research is inspired by the most sophisticated machine ever known: the human being, and more precisely the way humans use the multimodal nature of information to interact with their peers. The work reported here concerns the study, design and validation of systems for recognizing two-dimensional structures. The application considered here is the language of mathematical expressions, which is one of the most interesting 2D languages. The system we propose is original in that it uses two modalities simultaneously to achieve its task: both the speech and handwriting streams are used to perform the recognition in a bimodal fashion. This approach makes it possible to deal with the ambiguities that arise when mono-modal processing is used. The system exploits the complementarity between the two modalities and improves performance compared with mono-modal processing that uses the handwriting modality alone. To set up, train and validate our system, we built HAMEX, a bimodal database of mathematical expressions. It consists of 4350 mathematical expressions, each available in handwritten and audio forms, and is fully annotated. I provide a comparative evaluation of the mono-modal system based on handwriting alone (the baseline system) and the bimodal system I propose.
After the thesis: After defending my PhD, I kept working on the V1 system proposed for complete mathematical expression recognition in a bimodal context. I addressed the outstanding issues by following the perspectives I had identified at the end of the thesis. In particular, I proposed a method based on knowledge extraction from text analysis to solve the problem of aligning the handwriting and audio signals. The following table compares the final system at the end of my PhD with the best current system.