Louna Alsouki

 Data scientist

PhD student in applied mathematics at 

Claude Bernard University Lyon 1, Camille Jordan Institute; 

in cotutelle with Saint Joseph University of Beirut; 

in collaboration with IFP énergies nouvelles



Address


Claude Bernard University Lyon 1

Camille Jordan Institute

Bâtiment Braconnier

43 boulevard du 11 novembre 1918

69622 Villeurbanne Cedex

Office 131

News

Thesis defense 

The defense was held on Thursday, June 15th, at 10 am, in the Fokko du Cloux room, Braconnier building, La Doua campus, and online via Zoom. Drinks were served after the defense in the lab’s meeting room. A link to the recorded presentation will be available soon. 


The thesis was defended publicly in front of the jury members:

Dual Sparse Partial Least Squares

Dual-sPLS generalizes the classical PLS1 algorithm. It balances accurate prediction with efficient interpretation. It is based on penalizations inspired by classical regression methods (lasso, group lasso, least squares, ridge) and relies on the notion of dual norm. Sparsity is enforced through an intuitive shrinking ratio parameter. Dual-sPLS compares favorably to similar regression methods on simulated and real chemical data. 
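To make the idea of sparse PLS1 weights more concrete, here is a minimal Python sketch. It is not the Dual-sPLS algorithm itself and not the dual.spls package API: the function names `sparse_pls1` and `predict`, the soft-thresholding of the weight vector, and the choice of the threshold as a quantile of |Xᵀy| driven by a `shrink_ratio` parameter are all assumptions made for illustration, loosely in the spirit of a lasso-type penalty.

```python
import numpy as np

def soft_threshold(z, nu):
    """Shrink each entry of z toward zero by nu; entries below nu become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - nu, 0.0)

def sparse_pls1(X, y, n_components, shrink_ratio):
    """Illustrative PLS1 with soft-thresholded weights (hypothetical helper).

    shrink_ratio in [0, 1) is the assumed proportion of weight entries
    set to zero in each component.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xd = X - x_mean          # deflated predictor matrix
    yc = y - y_mean          # centered response
    n, p = X.shape
    W = np.zeros((p, n_components))   # weights
    P = np.zeros((p, n_components))   # loadings
    q = np.zeros(n_components)        # regression of y on each score
    for k in range(n_components):
        z = Xd.T @ yc                               # classical PLS1 direction
        nu = np.quantile(np.abs(z), shrink_ratio)   # assumed threshold rule
        w = soft_threshold(z, nu)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            raise ValueError("all weights were shrunk to zero")
        w /= norm
        t = Xd @ w                                  # score vector
        tt = t @ t
        P[:, k] = Xd.T @ t / tt
        W[:, k] = w
        q[k] = (t @ yc) / tt
        Xd = Xd - np.outer(t, P[:, k])              # deflate X
    # Coefficients expressed on the original (centered) predictors.
    beta = W @ np.linalg.solve(P.T @ W, q)
    return {"beta": beta, "x_mean": x_mean, "y_mean": y_mean}

def predict(model, X_new):
    """Predict responses for new samples from a fitted sketch model."""
    X_new = np.asarray(X_new, dtype=float)
    return model["y_mean"] + (X_new - model["x_mean"]) @ model["beta"]
```

In this sketch the coefficient vector is nonzero only on the variables selected by at least one component, which is what makes the fitted model easier to interpret on spectral data.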

dual.spls package 

Provides a series of functions for fitting a dual sparse partial least squares (Dual-sPLS) regression. These functions differ by the choice of the underlying norm.

CalValXy splitting method

CalValXy is a procedure that improves the quality of the information collected for database analysis. Its algorithm splits a database into two subsets, one for calibration and one for validation. It combines two approaches: random sampling and the procedure established by Kennard and Stone. Its originality lies in leveraging information from both the predictors and the outcome variables. 
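The two ingredients named above can be sketched in Python as follows. This is not the CalValXy algorithm: `kennard_stone` implements the classical Kennard-Stone selection rule, while `split_xy`, its `random_fraction` parameter, and the choice of running the selection on the standardized predictors stacked with the outcome are hypothetical, chosen only to show how both X and y can inform a calibration/validation split.

```python
import numpy as np

def kennard_stone(Z, n_select):
    """Classical Kennard-Stone selection: return n_select row indices of Z."""
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0]
    diff = Z[:, None, :] - Z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))     # pairwise Euclidean distances
    # Start with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Distance from each remaining sample to its nearest selected sample.
        d_min = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(d_min))]  # farthest from the selection
        selected.append(pick)
        remaining.remove(pick)
    return selected

def split_xy(X, y, n_cal, random_fraction=0.2, seed=0):
    """Hypothetical split mixing Kennard-Stone on [X | y] with random draws."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    Z = np.hstack([X, y])                        # use predictors and outcome
    std = Z.std(axis=0)
    std[std == 0] = 1.0                          # guard constant columns
    Z = (Z - Z.mean(axis=0)) / std
    n_random = int(round(random_fraction * n_cal))
    ks_idx = kennard_stone(Z, n_cal - n_random)  # needs at least 2 samples
    rest = np.setdiff1d(np.arange(len(X)), ks_idx)
    rand_idx = rng.choice(rest, size=n_random, replace=False)
    cal = np.sort(np.concatenate([ks_idx, rand_idx])).astype(int)
    val = np.setdiff1d(np.arange(len(X)), cal)
    return cal, val
```

The Kennard-Stone part spreads calibration samples across the joint (X, y) space, while the random part keeps some calibration samples representative of the bulk of the data.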

Near infrared spectra (NIRS) data  

This dataset contains 208 pre-treated near infrared spectra of different heavy oil samples and their derivatives. The corresponding raw data were acquired and recorded in the wavenumber range from 9000 to 4000 cm⁻¹. Spectroscopy is widely used to determine physico-chemical properties through predictive modelling. NIRS datasets help to better understand the contribution of spectral ranges to the density of petroleum samples, and so support accurate predictions. 

Thesis

Functional data regression with prediction and interpretability: property inference in chemometrics with sparse Partial Least Squares (PLS)

In this thesis, we focus on the characterization of heavy petroleum products using multivariate calibration for predictive analysis and regression modeling. Our data consist of spectral physico-chemical measurements (NIR, NMR...) of heavy oil, which we aim to link mathematically to one of their macroscopic properties (density, viscosity...). The resulting model will allow us to predict the properties of new oil samples. As the data are high dimensional, our objective is to build an algorithm that lets us manipulate, reduce, and visualize this type of data, combine different data sources, achieve predictive accuracy, and select the most relevant variables. 

Publications

Journal paper

National conferences

International conference  

Package publication

Webinar

Teaching

2021-2022

2020-2021