IRIDIA, Université Libre de Bruxelles
50, Av. F. Roosevelt, CP 194/6
B-1050, Brussels, Belgium
Contact:
moises.silva.munoz [at] ulb.be

Moisés Silva-Muñoz

I have been a PhD student since September 2019 at IRIDIA, the Artificial Intelligence lab of the Université Libre de Bruxelles.

My thesis supervisor is Prof. Hugues Bersini.

I am a computer engineer and I hold a master's degree in computer engineering from the University of Santiago de Chile. After finishing my master's thesis, I worked at the Laboratory of Optimization and Parallelism of the University of Santiago de Chile on the Dicyt project entitled "Study of exact terminals and the algorithms specialization", applied to the Maximal Independent Set Problem, a graph problem. I am currently working on my PhD in the area of database performance optimization.

I am funded by the CHIST-ERA project CHIST-ERA-17-BDSI-001 ABIDI "Context-aware and Veracious Big Data Analytics for Industrial IoT".

Publications

Moisés Silva-Muñoz, Gonzalo Calderon, Alberto Franzin, and Hugues Bersini. 2021. "Automatic configuration of the Cassandra database using irace". In PeerJ Computer Science, accepted.

Abstract: Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to the specific application considered. While this task has traditionally been performed manually, in the last years several methods have been proposed to automatically find the best parameter configuration for a database. Many of these methods, however, use statistical models that require high amounts of data and fail to represent all the factors that impact the performance of a database, or implement complex algorithmic solutions. In this work we study the potential of a simple model-free general-purpose configuration tool to automatically find the best parameter configuration of a database. We use the irace configurator to automatically find the best parameter configuration for the Cassandra NoSQL database using the YCSB benchmark under different scenarios. We establish a reliable experimental setup and obtain speedups of up to 30% over the default configuration in terms of throughput, and we provide an analysis of the configurations obtained.

Moisés Silva-Muñoz, Gonzalo Calderon, Alberto Franzin, and Hugues Bersini. 2021. "Determining a consistent experimental setup for benchmarking and optimizing databases". In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '21). Association for Computing Machinery, New York, NY, USA, 1614–1621. DOI:https://doi.org/10.1145/3449726.3463180

Abstract: The evaluation of the performance of an IT system is a fundamental operation in its benchmarking and optimization. However, despite the general consensus on the importance of this task, little guidance is usually provided to practitioners who need to benchmark their IT system. In particular, many works in the area of database optimization do not provide an adequate amount of information on the setup used in their experiments and analyses. In this work we report an experimental procedure that, through a sequence of experiments, analyzes the impact of various choices in the design of a database benchmark, leading to the identification of an experimental setup that balances the consistency of the results with the time needed to obtain them. We show that the minimal experimental setup we obtain is representative also of heavier scenarios, which makes it possible for the results of optimization tasks to scale.

Moisés Silva-Muñoz, Carlos Contreras-Bolton, Gustavo Silva Semaan, Mónica Villanueva and Víctor Parada, "Novel Algorithms Automatically Generated for Optimization Problems," 2019 38th International Conference of the Chilean Computer Science Society (SCCC), 2019, pp. 1-7, DOI: 10.1109/SCCC49216.2019.8966437.

Abstract: A difficult task in computer science is designing algorithms. This task is particularly complex when efficient algorithms must be constructed for computationally difficult optimization problems. Two fundamental problems in both combinatorial optimization and machine learning are the maximum independent set problem and the automatic data clustering problem. The best specific algorithms for both problems are heuristic and have been constructed by combining primary heuristics. However, the combinations explored so far cover only a minimal fraction of the entire combinatorial space. The automatic exploration of such space would, therefore, allow finding more efficient algorithmic combinations. This article reports new algorithms for both problems constructed by the automatic generation of algorithms, an emerging field that makes it possible to automatically produce an adequate algorithm for a specific set of instances of the problem. The best algorithms generated after the computational experimentation are reported and compared with existing heuristics, demonstrating that the new algorithms are competitive.