Louna Alsouki
Data scientist
PhD student in applied mathematics at
Claude Bernard University Lyon 1, Camille Jordan Institute;
in cotutelle with Saint Joseph University of Beirut;
in collaboration with IFP énergies nouvelles
Address
Claude Bernard University of Lyon 1
Camille Jordan Institute
Bâtiment Braconnier
43 boulevard du 11 novembre 1918
69622 Villeurbanne Cedex
Office 131
News
Thesis defense
The defense took place on Thursday, June 15th, at 10 am in the Fokko du Cloux room, Braconnier building, La Doua campus, and online via Zoom. A drink was offered after the defense in the lab’s meeting room. A link to the recorded presentation will be available soon.
The thesis was defended publicly in front of the jury members:
Rasmus Bro, Université de Copenhagen, reviewer
Hervé Cardot, Université de Bourgogne, examiner
Caroline Chaux, CNRS, examiner
Gabriela Ciuperca, Université Lyon 1, examiner
Laurent Duval, IFPEN, co-director
Rami El-Haddad, Université Saint Joseph de Beyrouth, co-director
Céline Helbert, Ecole centrale de Lyon, invited member
Sophie Lambert-Lacroix, Université de Grenoble Alpes, reviewer
Clément Marteau, Université Lyon 1, thesis director
Dual Sparse Partial Least Squares
Dual-sPLS generalizes the classical PLS1 algorithm. It balances accurate prediction with ease of interpretation. It is based on penalizations inspired by classical regression methods (lasso, group lasso, least squares, ridge) and relies on the notion of dual norm. Sparsity is controlled by an intuitive shrinking ratio parameter. Dual-sPLS compares favorably to similar regression methods on simulated and real chemical data.
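To make the role of the shrinking ratio more concrete, here is a minimal sketch of a soft-thresholded PLS1 weight vector in plain Python. It is an illustration under stated assumptions, not the Dual-sPLS algorithm or the dual.spls implementation: the function name sparse_pls1_weight, the rule that picks the threshold from the sorted coefficients, and the toy data are all hypothetical.

```python
import math
import random


def sparse_pls1_weight(X, y, shrink_ratio):
    """Soft-thresholded PLS1 weight vector (illustrative sketch only).

    X: list of n rows, each a list of p centered predictors.
    y: list of n centered responses.
    shrink_ratio: fraction of coordinates to set to zero, in [0, 1].
    """
    p = len(X[0])
    # Classical PLS1 direction, up to normalization: z = X^T y.
    z = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    # Assumed rule: the threshold nu is the k-th smallest |z_j|,
    # so that k coordinates are shrunk exactly to zero.
    k = round(shrink_ratio * p)
    nu = sorted(abs(v) for v in z)[k - 1] if k > 0 else 0.0
    # Soft-thresholding moves every coordinate toward zero by nu.
    w = [math.copysign(max(abs(v) - nu, 0.0), v) for v in z]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w] if norm > 0 else w


# Hypothetical toy data: only the first three predictors drive y.
random.seed(0)
n, p = 50, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
beta = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
y = [sum(b * x for b, x in zip(beta, row)) + random.gauss(0, 0.1) for row in X]

# Center the columns of X and the response y.
col_means = [sum(row[j] for row in X) / n for j in range(p)]
X = [[row[j] - col_means[j] for j in range(p)] for row in X]
y_mean = sum(y) / n
y = [v - y_mean for v in y]

w_sparse = sparse_pls1_weight(X, y, 0.8)  # 80% of the weights set to zero
w_dense = sparse_pls1_weight(X, y, 0.0)   # no shrinkage: normalized X^T y
```

With a ratio of 0 the sketch returns the usual normalized PLS1 direction, and larger ratios zero out more coordinates, which is the sense in which a single ratio parameter can tune sparsity.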
dual.spls package
Provides a series of functions for fitting a dual sparse partial least squares (Dual-sPLS) regression. These functions differ by the choice of the underlying norm.
CalValXy splitting method
CalValXy is a procedure that improves the quality of information collected for database analysis through an algorithm that splits a database into two subsets, one for calibration and one for validation. It combines two approaches: random sampling and the procedure established by Kennard and Stone. Its originality lies in leveraging information from both the predictors and the outcome variables.
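For readers unfamiliar with the Kennard and Stone procedure mentioned above, the following sketch shows the classical selection rule in plain Python. It is not the CalValXy algorithm: the random sampling component is omitted, and appending the response y to each predictor row before computing distances is only an assumption made here to illustrate using both X and y. The function name kennard_stone and the toy data are hypothetical.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule.

    points: list of equal-length coordinate tuples.
    n_select: number of points to select (2 <= n_select <= len(points)).
    """
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]
    # Start with the two mutually most distant points.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in (i0, j0)]
    while len(selected) < n_select:
        # Add the candidate whose nearest selected point is farthest away.
        nxt = max(remaining, key=lambda c: min(dist[c][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Hypothetical toy data: one predictor x and one response y per sample.
x = [0.0, 1.0, 2.0, 3.0, 10.0]
y = [0.0, 1.0, 0.0, 1.0, 0.0]
points = [(xi, yi) for xi, yi in zip(x, y)]

calibration = kennard_stone(points, 3)
validation = [i for i in range(len(points)) if i not in calibration]
```

The rule spreads the calibration samples across the joint (x, y) space, leaving the samples closest to already selected ones for validation.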
Near infrared spectra (NIRS) data
This dataset contains 208 pre-treated near infrared spectra of different heavy oil samples and their derivatives. The corresponding raw data were acquired and recorded in the wavenumber range from 9000 to 4000 cm⁻¹. Spectroscopy is widely used to determine physico-chemical properties via predictive modelling. NIRS datasets are used to better understand the contribution of spectral ranges to the density of petroleum samples for accurate predictions.
Thesis
Functional data regression with prediction and interpretability: property inference in chemometrics with sparse Partial Least Squares (PLS)
In this thesis, we focus on the characterization of heavy petroleum products using multivariate calibration for predictive analysis and regression modeling. Our data are composed of spectral physico-chemical measurements (NIR, NMR...) of heavy oil that we aim to link mathematically to one of their macroscopic properties (density, viscosity...). The resulting model will allow us to predict the properties of new oil samples. As the data are high dimensional, our objective is to build an algorithm that enables us to manipulate, reduce and visualize this type of data, combine them, achieve predictive accuracy, and select the most relevant variables.
Publications
Journal paper
Chemometrics and Intelligent Laboratory Systems, submitted in November 2022, "Dual-SPLS: a family of Dual Partial Least Squares regression with feature selection and tunable sparsity. Application to near-infrared data", published in June 2023
Data in Brief, to be submitted in December 2023, "Heavy oil density prediction using Near-infrared spectral (NIRS) dataset with chemometrics predictive analysis" (in preparation)
Chemometrics and Intelligent Laboratory Systems, to be submitted in October 2023, "CalValXy: well balanced and stratified calibration/validation splitting using both predictors X and response y" (in preparation)
Journal of Statistical Software, to be submitted in September 2023, "Dual sparse partial least squares: the dual.spls package" (in preparation)
National conferences
Journée des statistiques de la Société Française de Statistique JDS22, in June 2022, “Sparse PLS with group lasso: inside the dual.spls package”
Journée des statistiques de la Société Française de Statistique JDS21, in June 2021, “A new and generalized method of sparse partial least squares: theory and applications”
From PhD to PhD: A conference Mapping the Network of Lebanese Mathematics, in June 2021, “A generalized method for Sparse PLS: Dual sparse partial least squares”
e-Chimiométrie, in February 2021, “Interpretable Dual Sparse Partial Least Squares (Dual-SPLS) regression; Application to NMR/NIR petroleum data sets”
International conference
European Network for Business and Industrial Statistics, in June 2022, "Improving PLS with lasso shrinkage, using the dual.spls package"
Package publication
CRAN, in October 2022, "dual.spls"
Webinar
Monday Webinar Chemometrics and Machine Learning in Copenhagen, in April 2023, "Dual-sPLS: a versatile approach improving PLS with Lasso shrinkage"
Teaching
2021-2022
"Modèle de régression" for Master 2 SITN (Statistiques, Informatiques et Techniques numériques) and Data Science students at Claude Bernard University of Lyon 1 (numerical sessions).
"Generalized linear models" for Master 1 SAF (Science Actuarielle et Financière) and DSC (Data Science) students at Saint Joseph University of Beirut (theoretical course with exercise and numerical sessions).
2020-2021
"Modèle de régression" for Master 2 SITN (Statistiques, Informatiques et Techniques numériques) and Data Science students at Claude Bernard University of Lyon 1 (numerical sessions).
"Generalized linear models" for Master 1 SAF (Science Actuarielle et Financière) and DSC (Data Science) students at Saint Joseph University of Beirut (exercise and numerical sessions).