Publications
Incomplete List of Publications (peer reviewed, in order of acceptance). Alternatively, check Google Scholar: https://scholar.google.com/citations?user=B-XccgMAAAAJ&hl=en
Performance and Power Evaluation of Clustered VLIW Processors with Wide Functional Units. Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa and Mateo Valero. Proceedings of SAMOS'03 Workshop. Samos, 2003. Download
Power-Performance Trade-Offs in Wide and Clustered VLIW Cores for Numerical Computations. Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa and Mateo Valero. Intl. Symposium on High Performance Computing (ISHPC-V). Tokyo, 2003. Published as: Lecture Notes in Computer Science Volume 2858/2003. pp. 113-126 Download.
High-Performance Low-Power VLIW Cores for Numerical Computations. Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa and Mateo Valero. International Journal of High Performance Computing and Networking 2004 - Vol. 1, No.4 pp. 171 - 179
Performance and Power Evaluation of Clustered VLIW Processors with Wide Functional Units. Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa and Mateo Valero. Lecture Notes in Computer Science. Volume 3133/2004, 2004. pp. 88-97.
Power-Efficient VLIW Design using Clustering and Widening. Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa and Mateo Valero. Int. J. of Embedded Systems, 2008 Vol.3, No.3, pp.141 - 149.
Scalable Distributed Register File. Ruben Gonzalez, Adrian Cristal, Miquel Pericàs, Alex Veidenbaum and Mateo Valero. Workshop on Complexity Effective Design (WCED). June 2004. Download
An Optimized Front-End Physical Register File with Banking and Writeback Filtering. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Alex Veidenbaum and Mateo Valero. Workshop on Power Aware Computer Systems. December 2004. PDF PS
An Asymmetric Clustered Processor based on Value Content. Ruben Gonzalez, Adrian Cristal, Alex Veidenbaum, Miquel Pericas and Mateo Valero. International Conference on Supercomputing. June 2005
Kilo-Instruction Processors: Overcoming the Memory Wall. Adrian Cristal, Oliverio J. Santana, Francisco J. Cazorla, Marco Galluzzi, Tanausú Ramirez, Miquel Pericàs and Mateo Valero. IEEE MICRO. Vol 25, Nr 3. pp 48-57. May/June 2005
Exploiting Execution Locality with a Decoupled Kilo-Instruction Processor. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Daniel A. Jimenez and Mateo Valero. International Symposium on High Performance Computing (ISHPC-VI) 2005
The Decoupled State/Execute Architecture. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Alex Veidenbaum and Mateo Valero. International Symposium on High Performance Computing (ISHPC-VI) 2005
Chained In-Order/Out-of-Order DoubleCore Architecture. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Daniel A. Jimenez and Mateo Valero. SBAC-PAD 2005 Download
A Decoupled KILO-Instruction Processor. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Daniel A. Jimenez and Mateo Valero. The 12th Intl. Symp. on High Performance Computer Architecture HPCA-12 (2006) Download
An Optimized Front-End Physical Register File with Banking and Writeback Filtering. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Alex Veidenbaum and Mateo Valero. Lecture Notes in Computer Science. Volume 3471 / 2005. Pages 1 - 14
A Flexible Heterogeneous Multi-Core Architecture. Miquel Pericàs, Ruben Gonzalez, Francisco J. Cazorla, Adrian Cristal, Daniel A. Jimenez and Mateo Valero. The 2007 Intl. Conf. on Parallel Architecture and Compiler Techniques (PACT-2007) Download
Exploiting Execution Locality with a Decoupled Kilo-Instruction Processor. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Daniel A. Jimenez and Mateo Valero. Lecture Notes in Computer Science. Volume 4759/2008. pp. 56-67
The Decoupled State/Execute Architecture. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Alex Veidenbaum and Mateo Valero. Lecture Notes in Computer Science. Volume 4759/2008. pp. 68-78
Vectorized AES Core for high-throughput secure environments. Miquel Pericàs, Ricardo Chaves, Georgi N. Gaydadjiev, Mateo Valero and Stamatis Vassiliadis. VECPAR 2008, Toulouse, June 2008
A two-level Load/Store Queue based on Execution Locality. Miquel Pericas, Adrian Cristal, Ruben Gonzalez, Alex Veidenbaum, Daniel A. Jimenez and Mateo Valero. The 35th International Symposium on Computer Architecture (ISCA-35), Beijing, 2008 Download
Affordable KILO-Instruction Processors. Miquel Pericas. PhD Thesis, December 2008. Download
Exploiting Memory Customization in FPGA for 3D Stencil Computations. Muhammad Shafiq, Miquel Pericàs, Raul de la Cruz, Mauricio Araya-Polo, Nacho Navarro and Eduard Ayguadé. The 2009 International Conference on Field-Programmable Technology (FPT'09), Sydney, December 2009.
Row-interleaved streaming data flow implementation of Sparse Matrix Vector Multiplication in FPGA. Branimir Dickov, Miquel Pericàs, Nacho Navarro and Eduard Ayguadé. WRC 2010 - 4th HiPEAC Workshop on Reconfigurable Computing. Pisa, January 2010.
FEM : A Step Towards a Common Memory Layout for FPGA Based Accelerators. Muhammad Shafiq, Miquel Pericàs, Nacho Navarro and Eduard Ayguadé. FPL2010 The 20th International Conference on Field Programmable Logic and Applications, Milano, August-September 2010 Download
Assessing Accelerator-Based HPC Reverse Time Migration. Mauricio Araya-Polo, Javier Cabezas, Mauricio Hanzich, Miquel Pericàs, Felix Rubio, Isaac Gelado, Muhammad Shafiq, Enric Morancho, Nacho Navarro, Eduard Ayguadé, José María Cela and Mateo Valero. IEEE Transactions on Parallel and Distributed Systems. January 2011 (vol. 22 no. 1) pp. 147-162
Reconfigurable Memory Controller with Programmable Pattern Support. Tassadaq Hussain, Miquel Pericàs and Eduard Ayguadé. Workshop on Reconfigurable Computing 2011 (WRC 2011) Download
A Template System for the Efficient Compilation of Domain Abstractions onto Reconfigurable Computers. Muhammad Shafiq, Miquel Pericàs and Eduard Ayguadé. Workshop on Reconfigurable Computing (WRC 2011) Download
TARCAD: A Template Architecture for Reconfigurable Accelerator Designs. Muhammad Shafiq, Miquel Pericàs, Nacho Navarro and Eduard Ayguadé. The 9th IEEE Symposium on Application Specific Processors (SASP 2011)
Implementation of a Reverse Time Migration Kernel using the HCE High Level Synthesis Tool. Tassadaq Hussain, Miquel Pericas, Nacho Navarro and Eduard Ayguadé. The 2011 IEEE International Conference on Field-Programmable Technology (FPT'11), December 2011
Implementation of a Hierarchical N-Body Simulator Using the OmpSs Programming Model. Miquel Pericas, Xavier Martorell and Yoav Etsion. Workshop on Irregular Applications: Architectures & Algorithms (IAAA01), Seattle, November 2011
PPMC: A Programmable Pattern based Memory Controller. Tassadaq Hussain, Muhammad Shafiq, Miquel Pericas, Nacho Navarro and Eduard Ayguadé. The 8th International IEEE/ACM Symposium on Applied Reconfigurable Computing (ARC 2012), March 2012
BSArc: Blacksmith Streaming Architecture for HPC Accelerators. Muhammad Shafiq, Miquel Pericas, Nacho Navarro and Eduard Ayguade. The 2012 ACM International Conference on Computing Frontiers (CF 2012), May 2012
Assessing the impact of network compression on Molecular Dynamics and Finite Element Methods. Branimir Dickov, Miquel Pericas, Guillaume Houzeaux, Nacho Navarro and Eduard Ayguade. The 14th IEEE International Conference on High Performance Computing and Communication (HPCC-2012), June 2012
A Template System for the Efficient Compilation of Domain Abstractions onto Reconfigurable Computers. Muhammad Shafiq, Miquel Pericas, Nacho Navarro and Eduard Ayguade. Journal of Systems Architecture. February 2013 http://dx.doi.org/10.1016/j.sysarc.2012.10.002
Fork-Join and Data-Driven Execution Models on Multi-Core Architectures: Case study of the FMM. Abdelhalim Amer, Naoya Maruyama, Miquel Pericas, Kenjiro Taura, Rio Yokota and Satoshi Matsuoka. In Proceedings of the International Supercomputing Conference 2013 (ISC'13). Leipzig, June 2013
Design space explorations for streaming accelerators using streaming architectural simulator. Shafiq, M., Pericas, M., Navarro, N., Ayguade, E. Proceedings of 2013 10th International Bhurban Conference on Applied Sciences and Technology, IBCAST 2013, art. no. 6512151, pp. 169 - 178.
Analysis of Data Reuse in Task-Parallel Runtimes. Miquel Pericàs, Abdelhalim Amer, Kenjiro Taura and Satoshi Matsuoka. 4th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS13), Denver, November 2013. An improved version is available in: Lecture Notes in Computer Science, Springer, High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, pp 73-87, 2014
Scalable Analysis of Multicore Data Reuse and Sharing. Miquel Pericàs, Kenjiro Taura and Satoshi Matsuoka, The 2014 International Conference on Supercomputing (ICS’14), Munich, Germany, June 10–13 2014
Efficient String Sorting on Multi- and Many-Core Architectures. Aleksandr Drozd, Miquel Pericàs and Satoshi Matsuoka. 3rd International Congress on Big Data (BigData'14). June 27 - July 2, 2014, Anchorage, Alaska, USA
Software-Managed Power Reduction in Infiniband Links. Branimir Dickov, Miquel Pericàs, Paul Carpenter, Nacho Navarro and Eduard Ayguadé. The 43rd International Conference on Parallel Processing (ICPP'14). Minneapolis, September 9-12, 2014
Analyzing Performance Improvements and Energy Savings in Infiniband Architecture using Network Compression. Branimir Dickov, Miquel Pericàs, Paul Carpenter, Nacho Navarro and Eduard Ayguade. The 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Paris, France. October 22-24, 2014
DAGViz: A DAG Visualization Tool for Analyzing Task Parallel Program Traces. An Huynh, Douglas Thain, Miquel Pericas and Kenjiro Taura. 2nd Workshop on Visual Performance Analysis (VPA), Austin, November 20, 2015
Self-Tuned Software-Managed Energy Reduction in Infiniband Links. Branimir Dickov, Paul Carpenter, Miquel Pericàs and Eduard Ayguade, 21st IEEE International Conference on Parallel and Distributed Systems (ICPADS 2015), December 14-17, 2015.
Scalable and Locality-aware Resource Management with Task Assembly Objects. Miquel Pericas. Workshop on Runtime Systems for Extreme Scale Programming Models and Architectures (RESPA'15). Austin, Texas, November 16th, 2015.
RADAR: Runtime-Assisted Dead Region Management for Last-Level Caches. Madhavan Manivannan, Vassilis Papaefstathiou, Miquel Pericàs, Per Stenström. 22nd Intl. Symp. on High Performance Computer Architecture (HPCA-22). March 2016
Scaling FMM with Data-Driven OpenMP Tasks on Multicore Architectures. Abdelhalim Amer, Satoshi Matsuoka, Miquel Pericas, Naoya Maruyama, Kenjiro Taura, Rio Yokota and Pavan Balaji. 12th International Workshop on OpenMP. In Nara, Japan, October 5-7, 2016.
POSTER: ξ-TAO: A cache-centric execution model and runtime for deep parallel multicore topologies. Miquel Pericàs. 25th International Conference on Parallel Architectures and Compilation Techniques (PACT 2016). In Haifa, Sept 11-15, 2016.
Runtime-Assisted Global Cache Management for Task-based Parallel Programs. Madhavan Manivannan, Miquel Pericàs, Vassilis Papaefstathiou and Per Stenström. IEEE Computer Architecture Letters. 2017
Trends in Data Locality Abstractions for HPC systems, Unat, D.; Dubey, A.; Hoefler, T.; Shalf, J.; Abraham, M.; Bianco, M.; Chamberlain, B. L.; Cledat, R.; Edwards, H. C.; Finkel, H.; Fuerlinger, K.; Hannig, F.; Jeannot, E.; Kamil, A.; Keasler, J.; Kelly, P. H. J.; Leung, V.; Ltaief, H.; Maruyama, N.; Newburn, C. J. & Pericas, M. IEEE Transactions on Parallel and Distributed Systems. 2017
SWAS: Stealing Work using Approximate System-load Information, Stavros Tzilis, Miquel Pericas, Pedro Trancoso and Ioannis Sourdis, 13th International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS), In Bristol, August 14th 2017.
Elastic Places: an adaptive resource manager for scalable and portable performance. Miquel Pericàs, ACM Transactions on Architecture and Code Optimization, Volume 15 Issue 2, May 2018
Adrian Cristal, Osman S. Unsal, Xavier Martorell, Paul Carpenter, Raul De La Cruz, Leonardo Bautista, Daniel Jimenez, Carlos Alvarez, Behzad Salami, Sergi Madonar, Miquel Pericàs, Pedro Trancoso, Micha vor dem Berge, Gunnar Billung-Meyer, Stefan Krupop, Wolfgang Christmann, Frank Klawonn, Amani Mihklafi, Tobias Becker, Georgi Gaydadjiev, Hans Salomonsson, Devdatt Dubhashi, Oron Port, Yoav Etsion, Vesna Nowack, Christof Fetzer, Jens Hagemeyer, Thorsten Jungeblut, Nils Kucza, Martin Kaiser, Mario Porrmann, Marcelo Pasin, Valerio Schiavoni, Isabelly Rocha, Christian Göttel, and Pascal Felber. 2018. LEGaTO: towards energy-efficient, secure, fault-tolerant toolset for heterogeneous computing. In Proceedings of the 15th ACM International Conference on Computing Frontiers (CF '18).
Global Dead-Block Management for Task-Parallel Programs. Madhavan Manivannan, Miquel Pericàs, Vassilis Papaefstathiou, Per Stenström. ACM Transactions on Architecture and Code Optimization, Vol 15, Issue 3, August 2018
High performance scheduling of mixed-mode DAGs on heterogeneous multicores. Agnes Rohlin, Henrik Fahlgren and Miquel Pericas. 7th International Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES-2019). Jan 22nd, 2019
QoS-Driven Coordinated Management of Resources to Save Energy in Multicore Systems. Mehrzad Nejat, Miquel Pericas and Per Stenström. 33rd IEEE International Parallel and Distributed Processing Symposium. Rio de Janeiro, Brazil. May 20-24, 2019
Muhammad Waqar Azhar, Miquel Pericàs, Per Stenström. SaC: Exploiting Execution-Time Slack to Save Energy in Heterogeneous Multicore Systems. 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, August 2019
Jing Chen, Madhavan Manivannan, Mustafa Abduljabbar, Miquel Pericas. Towards an Energy Aware Task Scheduler for Asymmetric Architectures. Nordic Workshop on Multicore Computing (MCC2019), Karlskrona, November 2019
Jeckson Dellagostin Souza, Madhavan Manivannan, Miquel Pericas and Antonio Carlos Schneider Beck, Enhancing Multithreaded Performance of Asymmetric Multicores with SIMD offloading. Design, Automation, and Test in Europe Conference (DATE'2020), March 2020
Behzad Salami, Konstantinos Parasyris, Adrián Cristal, O Unsal, Xavier Martorell, Paul Carpenter, Raúl De La Cruz, L Bautista, D Jimenez, C Alvarez, S Nabavi, Sergi Madonar, Miquel Pericàs, Pedro Trancoso, M Abduljabbar, Jing Chen, Pirah Noor Soomro, Madhavan Manivannan, M Berge, Stefan Krupop, Frank Klawonn, Al Mekhlafi, Sigrun May, Tobias Becker, Georgi Gaydadjiev, H Salomonsson, D Dubhashi, O Port, Y Etsion, Christof Fetzer, M Kaiser, N Kucza, J Hagemeyer, R Griessl, L Tigges, K Mika, A Hüffmeier, M Pasin, V Schiavoni, I Rocha, C Göttel, P Felber. LEGaTO: low-energy, secure, and resilient toolset for heterogeneous computing. Design, Automation, and Test in Europe Conference (DATE'2020), March 2020
Mehrzad Nejat, Madhavan Manivannan, Miquel Pericas, and Per Stenström. Coordinated Management of Processor Configuration and Cache Partitioning to Optimize Energy under QoS Constraints. IPDPS-2020, New Orleans, May 2020.
Nadja Holtryd, Madhavan Manivannan, Per Stenström, and Miquel Pericàs. DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors. IPDPS-2020, New Orleans, May 2020.
Jeckson Dellagostin Souza, Madhavan Manivannan, Miquel Pericas and Antonio Carlos Schneider Beck. Enhancing Thread-Level Parallelism in Asymmetric Multicores using Transparent Instruction Offloading. Design Automation Conference (DAC), 2020
Jing Chen, Pirah Noor Soomro, Mustafa Abduljabbar, Madhavan Manivannan, Miquel Pericàs. Scheduling Task-parallel Applications in Dynamically Asymmetric Environments. ICPP Workshops 2020: 18:1-18:10
Nejat, Mehrzad, Madhavan Manivannan, Miquel Pericàs, and Per Stenström. "Coordinated management of DVFS and cache partitioning under QoS constraints to save energy in multi-core systems." Journal of Parallel and Distributed Computing (2020).
Pirah Noor Soomro, Mustafa Abduljabbar, Jeronimo Castrillon, and Miquel Pericàs. An online guided tuning approach to run CNN pipelines on edge devices. 18th ACM International Conference on Computing Frontiers (CF'21) , 2021.
Muhammad Nufail Farooqi and Miquel Pericas. Vectorized Barrier and Reduction in LLVM OpenMP Runtime. IWOMP 2021
Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström and Miquel Pericàs. CBP: Coordinated management of cache partitioning, bandwidth partitioning and prefetch throttling, PACT 2021
Muhammad Waqar, Miquel Pericas, Per Stenström. Task-RM. A resource manager for energy reduction in task-parallel applications under Quality of Service constraints. ACM TACO January 2022
Mehrzad Nejat, Madhavan Manivannan, Miquel Pericas, Per Stenström. Cooperative Slack Management: Saving Energy of Multi-Core Processors by Trading Performance Slack Between QoS Constrained Applications, ACM TACO
Jing Chen, Madhavan Manivannan, Mustafa Abduljabbar, and Miquel Pericas. ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes. ACM TACO
Community Whitepapers
Programming Abstractions for Data Locality, D. Unat, J. Shalf, T. Hoefler, T. Schulthess, A. Dubey et al., White Paper, PADAL Workshop, 28-29 April, 2014, Lugano, Switzerland
Invited papers
LEGaTO: Towards Energy-Efficient, Secure, Fault-tolerant Toolset for Heterogeneous Computing, Cristal et al. ACM International conference on Computing Frontiers'18, May 2018, Ischia, Italy
LEGaTO: First Steps Towards Energy-Efficient Toolset for Heterogeneous Computing. Cristal et al. International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS-XVIII). Samos, Greece, July 2018
Technical reports:
Banked Front-End Register File. Miquel Pericàs, Ruben Gonzalez, Adrian Cristal, Alex Veidenbaum and Mateo Valero. Technical Report. Reference: UPC-DAC-2004-35. October 2004
A Flexible Heterogeneous Multi-Core Architecture. Miquel Pericas, Adrian Cristal, Francisco J. Cazorla, Ruben González, Daniel A. Jimenez, Mateo Valero. Technical Report. Reference: UPC-DAC-RR-CAP-2006-15. August 2006
A Reconfigurable Heterogeneous Multi-Core Architecture. Miquel Pericas, Adrian Cristal, Francisco J. Cazorla, Ruben González, Daniel A. Jimenez, Mateo Valero. Technical Report. Reference: UPC-DAC-RR-CAP-2007-1. February 2007
Vectorized AES Core for high-throughput secure environments. Miquel Pericàs, Ricardo Chaves, Georgi N. Gaydadjiev, Mateo Valero and Stamatis Vassiliadis. The Future of Computing - essays in memory of Stamatis Vassiliadis (2007) Download
Towards a Dataflow FMM using the OmpSs Programming Model. Miquel Pericas, Abdelhalim Amer, Keisuke Fukuda, Naoya Maruyama, Rio Yokota, Satoshi Matsuoka. In 第136回ハイパフォーマンスコンピューティング研究発表会 (136th IPSJ Conference on High Performance Computing). October 2012 Download
DAG Recorder: A Task-Centric Tracing Algorithm for Task Parallel Applications, Kenjiro Taura, Yuhei Kikuchi, Jun Nakashima and Miquel Pericàs. Annual Meeting on Advanced Computing System and Infrastructure (ACSI), January 26-28, 2015
RADAR: Runtime-Assisted Dead Region Management for Last-Level Caches. Madhavan Manivannan, Vassilis Papaefstathiou, Miquel Pericàs and Per Stenström. Technical report - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University, 2015
Task Assembly Objects: a Cache-centric Execution Model and its Prototype Runtime Implementation. Miquel Pericàs. Technical report 2016.1 - Department of Computer Science and Engineering, Chalmers University of Technology, 2016.
Posters:
POSTER: ξ-TAO: an exascale execution model based on dynamic co-scheduling of tasks. Miquel Pericas, Stavros Tzilis and Ioannis Sourdis. 2016 Exascale Applications & Software Conference, Stockholm. 2016
POSTER: Runtime-management of manycore cache hierarchies, Nadja Holtryd, Per Stenström and Miquel Pericas, The 12th HiPEAC Conference, In Stockholm, January 2017
POSTER: Towards distributed self-guided caches, Nadja Holtryd, Per Stenström and Miquel Pericas, Thirteenth International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems 9-15 July 2017, Fiuggi, Italy
In Other Languages:
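動的タスクスケジューリングエンジンStarPUによるKIFMMの実装と性能評価 (Implementation and Performance Evaluation of KIFMM Using the StarPU Dynamic Task Scheduling Engine). Keisuke Fukuda, Naoya Maruyama, Miquel Pericas and Satoshi Matsuoka. In 第136回ハイパフォーマンスコンピューティング研究発表会 (136th IPSJ Conference on High Performance Computing). October 2012 (in Japanese)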