About my thesis
To obtain the best possible partition and to ensure good convergence, I worked during my thesis on applying more robust optimization methods. Studying only the first-order optimality conditions of the non-convex cost function is not always sufficient to obtain the best classification.
Experimental results show the benefit of this new method on large datasets. An article is currently being submitted.
Paper presented at OLA 2023
Optimization of Fuzzy C-Means with Alternating Direction Method of Multipliers
Paper presented at EGC 2023
From an optimization point of view, we may consider that we have the best partition once we have found the parameters that minimize the cost function.
However, this partition is only optimal with respect to the cost function itself, so it is important to compare it against other evaluation criteria. Among the validity indices, external criteria measure how similar two partitions are, while internal criteria evaluate the compactness of the elements within a cluster and/or the separation between clusters.
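As an illustration of an internal criterion, here is a minimal sketch of the standard Xie-Beni index, which divides the fuzzy within-cluster compactness by the smallest squared distance between two cluster centers (lower is better). It assumes the plain Euclidean distance and is not the specialized measure mentioned further down; the function names `sq_dist` and `xie_beni` and the data layout (`memberships[i][k]` is the membership of point k in cluster i) are choices made for this example only.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def xie_beni(points, centers, memberships, m=2.0):
    """Standard Xie-Beni index with Euclidean distance (illustrative sketch).

    points:      list of n points
    centers:     list of c cluster centers (c >= 2)
    memberships: c x n nested list, memberships[i][k] in [0, 1]
    m:           fuzzifier applied to the memberships (assumed default 2.0)
    """
    n = len(points)
    # Compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        memberships[i][k] ** m * sq_dist(points[k], centers[i])
        for i in range(len(centers))
        for k in range(n)
    )
    # Separation: smallest squared distance between two distinct centers.
    separation = min(sq_dist(a, b) for a, b in combinations(centers, 2))
    return compactness / (n * separation)


# Toy data: two tight, well-separated groups with crisp memberships.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centers = [(0.0, 0.5), (10.0, 0.5)]
memberships = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
print(xie_beni(points, centers, memberships))
```

Because both the compactness and the separation terms rely on the Euclidean distance here, the index implicitly favors spherical clusters, which is one reason a distance-aware criterion is useful when clusters are fitted with an adaptive distance.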
No existing criterion takes the Mahalanobis distance into account. Together with Shidi Deng, an intern whom I had the opportunity to supervise, we developed such a criterion.
Paper presented at EUSFLAT 2023
A Specialized Xie-Beni Measure for Clustering with Adaptive Distance
ECM, the evidential variant of K-means, is based on the theory of belief functions. The partition of the dataset into c clusters is described by the belief mass allocated to each subset of clusters, which makes it possible to model imprecision. With the Mahalanobis distance, however, the imprecision zone is poorly modeled by the barycentric formulation.
In this context, a new definition of the prototypes for the subsets is proposed, and the ECM objective function is then optimized using this new definition. The resulting algorithm, named ECM+, is finally tested on various synthetic and real datasets to demonstrate its benefits compared to ECM.
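For reference, the barycentric formulation used by standard ECM can be sketched as follows: the prototype of a subset of clusters is the plain average of the centers of the clusters it contains. This is only an illustration of that baseline, not of the new prototype definition used in ECM+; the function name `subset_prototypes` and the representation of subsets as tuples of cluster indices are assumptions made for this example.

```python
from itertools import combinations


def subset_prototypes(centers):
    """Barycentric prototypes for every non-empty subset of clusters.

    centers: list of c cluster centers, each a sequence of coordinates.
    Returns a dict mapping a tuple of cluster indices to the barycenter
    of the corresponding centers (the empty set is left out).
    """
    c = len(centers)
    dim = len(centers[0])
    prototypes = {}
    for size in range(1, c + 1):
        for subset in combinations(range(c), size):
            # Average the selected centers coordinate by coordinate.
            prototypes[subset] = tuple(
                sum(centers[i][d] for i in subset) / size for d in range(dim)
            )
    return prototypes


# Toy example with three cluster centers in the plane.
centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0)]
for subset, proto in subset_prototypes(centers).items():
    print(subset, proto)
```

Since the barycenter depends only on the positions of the centers, it ignores the shape that an adaptive metric gives to each cluster, which is consistent with the limitation described above.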