Research Activities

Internships

Universidad Complutense de Madrid @ ArTeCS research group. October 2024 to September 2025.
Barcelona Supercomputing Center @ AccelCom research group. May 2023 to July 2023.
Barcelona Supercomputing Center @ AccelCom research group. December 2022 to February 2023.
Universitat Jaume I de Castelló @ HPC&A research group. May to July 2021.
Barcelona Supercomputing Center @ PM research group. March to April 2017.
Argonne National Laboratory @ PMRS research group. May to August 2015.

Grants & Awards

CIAEST from Generalitat Valenciana (GVA). CIAEST/2026/015
R3 2025 from Agencia Estatal de Investigación. CR32025-051809
Post-doctoral grant from APOSTD 2023 program of the Generalitat Valenciana (GVA). CIAPOS/2023/431
Post-doctoral grant from Juan de la Cierva - Formación program of the Ministerio de Ciencia y Universidades (Spanish Government). FJC2019-039222-I
Pre-doctoral grant from Vali+D 2015 program of the Generalitat Valenciana (GVA). ACIF/2015/167
HPC4Lab grant from Horizon 2020 program of HiPEAC. #671610
PRACE Preparatory Access project. 13,200 core hours. #2010PA5531 (Oct-Dec 2020)
PRACE Preparatory Access project. 13,200 core hours. #2010PA5650 (Feb-Mar 2021)

Attended Courses

ACM Europe Summer School, Organized by ACM, Barcelona, Spain. 2021
ACM Europe Summer School, Organized by ACM, Barcelona, Spain. 2019
Advanced Computer Architecture and Compilation for High-performance and Embedded Systems, ACACES, Organized by HiPEAC, Fiuggi, Italy. 2016
International Summer School on HPC Challenges in Computational Sciences, OpenMP and MPI Programming Models, Organized by PRACE, XSEDE, AICS and Compute Canada, Ljubljana, Slovenia. 2016
Parallel Programming, OpenMP and OmpSs Programming Models, PATC @ BSC, Organized by PRACE and BSC, Barcelona, Spain. 2015
Parallel Programming with MPI, Introduction and Advanced Concepts of MPI, Taught by Pavan Balaji, Rajeev Thakur, Ken Raffenetti, Antonio Peña and Min Si. Argonne National Laboratory, Lemont, IL (USA). 2015
Introduction to Parallel Programming, Using CUDA to Harness the Power of GPUs, UDACITY (www.udacity.com), Taught by David Luebke and John Owens. 2012
Programming and tUning Massively Parallel Systems, PUMPS, Taught by Wen-mei W. Hwu and David B. Kirk, BSC and UPC, Barcelona, Spain. 2012

Talks

Seminars

Optimizando Transformers con Código Autogenerado. XV Seminario de Invierno de CAPAP-H. Jan 25. Cáceres, Spain.
Automatic Code Generation for Small Matrix Multiplication. BLIS retreat 2022. Sept 2022.
Improving BLIS code-generation and portability via TVM. BLIS retreat 2021. Oct 2022.
Porting the Interoperability to Lightweight Threads. Barcelona Supercomputing Center. Spain. Nov 2016.
GLT: A Unified API for Lightweight Thread Libraries. Universitat Jaume I, Spain, Nov 2016.
Improving the Performance of OpenMP Using Lightweight Threads. Argonne National Laboratory. Lemont, IL (USA). Aug. 2015.
Introduction to InfiniBand. Universitat Jaume I, Spain. Jun. 2013.

International Conferences

Tackling the Matrix Multiplication Micro-kernel Generation with EXO. The International Symposium on Code Generation and Optimization -- CGO 2024, Edinburgh (Scotland). 2024.
Anatomy of the BLIS Family of Algorithms for Matrix Multiplication. 30th Euromicro Workshop on Parallel and Distributed Processing -- PDP 2022, Valladolid (Spain). 2022.
Evaluation of MPI Allreduce for distributed training of convolutional neural networks. 29th Euromicro Workshop on Parallel and Distributed Processing -- PDP 2021, Valladolid (Spain). 2021.
Analysis of model parallelism for distributed neural networks. 26th European MPI Users' Group Meeting -- Euro-MPI 2019, Zürich (Switzerland). Sep. 2019.
GLT: A Unified API for Lightweight Thread Libraries. IEEE International Conference on Parallel and Distributed Computing. Santiago de Compostela (Spain). Sept. 2017.
GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations. IEEE International Conference on Parallel Processing (ICPP 2017). Bristol (UK). Aug. 2017.
A Review of Lightweight Thread Approaches for High Performance Computing. International Conference on Cluster Computing (CLUSTER 2016). Taipei (Taiwan). Sept. 2016.
Exploring the Suitability of Remote GPGPU Virtualization for the OpenACC Programming Model Using rCUDA. International Conference on Cluster Computing (CLUSTER 2015). Chicago, IL (USA). Sept. 2015.
On the Use of Remote GPUs and Low-Power Processors for the Acceleration of Scientific Applications. The Fourth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies (ENERGY 2014). Chamonix (France). Apr. 2014.
