Research Projects

International Project EDDA (2021-2024)

The analysis of multi-dimensional time series is a fundamental problem in most areas of science and industry. Often, linear models are insufficient to capture the structure present in data. We propose to develop and improve techniques for this problem based on a mathematical object known as the iterated-integrals signature (IIS). Equipped with mathematical guarantees, the IIS is a means to extract (almost all) multilinear features of a time series. It is hence, at least on paper, well-suited to discover non-linear effects in data. Recently, this suspicion has been corroborated by a series of works that successfully apply the IIS in the realm of data science and statistics.
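As a concrete illustration of what the IIS contains, here is a minimal sketch in plain Python that computes its first two levels for a piecewise-linear path. The function name `signature_level_2` and the toy path are assumptions made for this example, not part of the project's software. Level 1 collects the total increments of each coordinate; level 2 collects the twofold iterated integrals, whose antisymmetric part is the signed area between two coordinates.

```python
def signature_level_2(path):
    """Return levels 1 and 2 of the iterated-integrals signature.

    `path` is a sequence of points (each a sequence of d floats),
    interpreted as a piecewise-linear curve through those points.
    """
    d = len(path[0])
    level1 = [0.0] * d                         # running increment x_t - x_0
    level2 = [[0.0] * d for _ in range(d)]     # twofold iterated integrals

    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                # Exact integral of (x^i - x^i_0) dx^j over one linear segment:
                # the increment accumulated so far, plus the segment's own part.
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]

    return level1, level2


# Toy two-dimensional path: one step right, then one step up.
level1, level2 = signature_level_2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])

# Signed area between the path and its chord, from the antisymmetric part.
area = 0.5 * (level2[0][1] - level2[1][0])
```

For this toy path the two off-diagonal entries of level 2 differ, which shows that the signature records the order in which the coordinates move, something the increments alone cannot see.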

Our interdisciplinary team of researchers from machine learning, algebra, stochastic analysis, data assimilation and oceanography aims to:

  • develop interpretable features of multi-dimensional time series in a rigorous algebraic framework based on the IIS, for the analysis of dependence, synchronization and structure

  • understand how to extract these features in a robust fashion, develop statistical guarantees for these features in the setting of standard time-series models, and benchmark on synthetic data

  • use these new, as well as existing, statistical methods to carry out original investigations of oceanic and climate data


Project ANR "Pro-TEXT" (ANR AAPG 2018-2021)

This project studies textualization processes from the perspectives of linguistic modeling, psycho-linguistics and machine learning. The work is carried out in collaboration with the computer science laboratory of Paris Nord (LIPN: teams RCLN and A3), the Clesthia laboratory of the University "Sorbonne nouvelle" and the CERCA of the University of Poitiers. My contribution to this project is the development of unsupervised and semi-supervised machine learning approaches for textual data analysis.


Project CIFRE (2015-2018)

The objective of this work was to design innovative unsupervised learning methods for complex data. In the proposed approaches, the entire learning process adapts in real time to the evolution of the data, which was the main challenge of this work. This work was a collaboration between the computer science laboratory of Paris Nord and MindlyTix Co.
