Pier Stanislao PAOLUCCI
Publications and grants listed on:
ORCID researcher-ID: http://orcid.org/0000-0003-1937-6086
scholar.google: http://scholar.google.it/citations?user=jhvLaT8AAAAJ
SCOPUS author-ID: 55324291700 (and other IDs, sorry)
ISI-WEB researcher-ID: http://www.researcherid.com/rid/J-3457-2012
Present Position (from April 1st, 2020)
Human Brain Project, <Networks underlying brain cognition and consciousness> WorkPackage, Deputy Leader
National Coordinator of the INFN project HBP_WaveScales and Permanent Staff Researcher at INFN Roma, Ape Massive Parallel Computers Design Group
Previous Positions
Human Brain Project, Deputy Leader, Cognitive and Systems Neuroscience (SubProject 3) (2016-2020)
Coordinator of the WaveScalES experiment, WP3.2 in the Human Brain Project (2016-2020)
Coordinator of the European Collaborative Project EURETILE: EUropean REference TIled architecture Experiment (2010-2014)
Coordinator of the European Integrated FP6 Project SHAPES: Scalable Software Hardware Architecture Platform for Embedded Systems (2006-2009)
CTO, Atmel Roma design center (2000-2010).
Architect of Diopsis 740 and Diopsis 940 MPSoCs (ARM + mAgic VLIW DSP)
Architect of mAgic/mAgicV VLIW floating-point DSP Core
Permanent Staff Researcher (part-time) INFN Roma, Ape Massive Parallel Computers Design Group
Since 1984, member of the INFN APE (1984-1988), APE100 (1989-1994) and APEmille (1995-2000) design teams of Massive Parallel/Distributed Computers
Co-Founder of IPITEC (Intellectual Property Initiative for Tools and Embedded Cores) (1997-2000)
EUREKA DIAM !2390 Project - Italian Coordinator
Architecture Responsible of the ESPRIT FP4 European Project mAgic-FPU 27000 (1997-2000)
Promoting Member, BRINSAR (Brain Inspired Architectures) Research Organization
2015-03-17: News from the European Commission about EURETILE results:
Keywords about activities of Pier Stanislao PAOLUCCI:
Human Brain Project, SP3 WaveScalES
DPSNN - STDP: Distributed Spiking Neural Net simulator with Spike Timing Dependent Plasticity
EURETILE European FP7 Research Project (2010-2014)
INFN APE SIMD Massive Parallel Computing Project (Ape Home Page)
SHAPES European FP6 Research Project (2006-2009)
ATMEL Diopsis 740 and Diopsis 940 (ARM + mAgic DSP) MPSoC
ATMEL mAgic and mAgicVLIW Floating Point DSP Core
DIAM (Digital AM) EUREKA !2390 Project (2000-2006)
mAgic FP4 ESPRIT 27000 Project (Music and antenna applications ASIC-based general-purpose IC-core floating point unit) - From 1998-03-01 to 2000-08-31
MADE: Modular Architecture/Application Development Environment (Science Direct portal)
DyProDe Patent: Dynamic Program Decompression Device for VLIW Program Compression
Cubed Sphere: simulation of spherical problems on next-neighbors parallel computer topologies
SIMD Matrix Transposition on next-neighbors topologies
SIMD FFT on next neighbors topologies
Evolving Grammars and Dynamic Parsers (ACM portal)
Zz Parser
TAO Programming Language
Esprit 27000 mAgic-FPU
DIAM Digital AM Radio Eureka
Neural Networks on next neighbors topologies
Fully connected systems on next neighbors topologies
Partial List of Patents, Publications on Scientific Journals/Conference Proceedings (partially updated after 2003)
For an updated list of publications, see
scholar.google user link: http://scholar.google.it/citations?user=jhvLaT8AAAAJ
Patents
US 7,437,540 Patent (14 Oct 2008) P.S. Paolucci et al. "Complex Domain Floating-Point VLIW DSP with Data/Program Bus Multiplexer and Microprocessor Interface" about mAgic VLIW Floating point DSP and Diopsis 740 RISC+VLIW MPSoC architectures (downloadable from the Patents page).
US 6,766,439 Patent (20 Jul 2004) P.S. Paolucci "Apparatus and Method for Dynamic Program Decompression" about SW Compression and HW Decompression of VLIW Program (downloadable from the Patents page)
Publications/Presentations
Hereafter a partial list of publications
Celotto, M., et al. Analysis and Model of Cortical Slow Waves Acquired with Optical Techniques. Methods Protoc. 2020, 3, 14 https://doi.org/10.3390/mps3010014
Bio-intelligenza artificiale. PS Paolucci. Asimmetrie 27, 38-39 (2019) 10.23801/asimmetrie.2019.27.9
Scaling of a Large-Scale Simulation of Synchronous Slow-Wave and Asynchronous Awake-Like Activity of a Cortical Model With Long-Range Interconnections E Pastorelli, C Capone, F Simula, MV Sanchez-Vives, P Del Giudice, et al. Frontiers in Systems Neuroscience (2019) 13, 33. https://doi.org/10.3389/fnsys.2019.00033
Sleep-like slow oscillations improve visual classification through synaptic homeostasis and memory association in a thalamo-cortical model. C Capone, E Pastorelli, B Golosio, PS Paolucci. Scientific Reports 9 1 (2019) https://www.nature.com/articles/s41598-019-45525-0
NaNet: a Reconfigurable PCIe Network Interface Card Architecture for Real-time Distributed Heterogeneous Stream Processing in the NA62 Low Level Trigger. P Cretaro, F Lo Cicero, A Lonardo, M Sozzi, R Piandani, A Biagioni, ... PoS, 118 (2019)
Analysis pipeline for extracting features of cortical slow oscillations G De Bonis, M Dasilva, A Pazienti, MV Sanchez-Vives, M Mattia, ... Frontiers in Systems Neuroscience 13, 70 (2019) doi:10.3389/fnsys.2019.00070
Real-Time Cortical Simulations: Energy and Interconnect Scaling on Distributed Systems F Simula, E Pastorelli, PS Paolucci, M Martinelli, A Lonardo, A Biagioni, ... 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Pavia, Italy, 2019. 10.1109/EMPDP.2019.8671627
Analysis and Model of Cortical Slow Waves Acquired with Optical Techniques M Celotto, C De Luca, P Muratore, F Resta, ALA Mascaro, F Pavone, ... arXiv preprint arXiv:1811.11687 (2018)
Real-time heterogeneous stream processing with NaNet in the NA62 experiment R Ammendola, M Barbanera, A Biagioni, P Cretaro, O Frezza, G Lamanna, ... Journal of Physics: Conference Series 1085 (3), 032022
Next generation of Exascale-class systems: ExaNeSt project and the status of its interconnect and storage development M Katevenis, R Ammendola, A Biagioni, P Cretaro, O Frezza, FL Cicero, ... Microprocessors and Microsystems 61, 58-71 9 (2018) https://doi.org/10.1016/j.micpro.2018.05.009
Gaussian and exponential lateral connectivity on distributed spiking neural network simulation E Pastorelli, PS Paolucci, F Simula, A Biagioni, F Capuani, P Cretaro, Giulia De Bonis, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Luca Pontisso, Piero Vicini, Roberto Ammendola 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) 10.1109/PDP2018.2018.00110
Parallel Computing is Everywhere S Bassini, M Danelutto, P Dazzi IOS Press 2 (2018)
The INFN COSA Project: Low-Power Computing and Storage Daniele Cesini, Elena Corni, Antonio Falabella, Andrea Ferraro, Luca Lama, Lucia Morganti, Enrico Calore, Sebastiano Fabio Schifano, Michele Michelotto, Roberto Alfieri, Roberto De Pietri, Tommaso Boccali, Andrea Biagioni, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Pier Stanislao Paolucci, Elena Pastorelli, Piero Vicini. Advances in Parallel Computing 32, 770-779 1 2018 https://doi.org/10.3233/978-1-61499-843-3-770
Large Scale Low Power Computing System – Status of Network Design in ExaNeSt and EuroExa Projects. Roberto Ammendola, Andrea Biagioni, Fabrizio Capuani, Paolo Cretaro, Giulia De Bonis, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Pier Stanislao Paolucci, Elena Pastorelli, Luca Pontisso, Francesco Simula, Piero Vicini. Advances in Parallel Computing 32, 750-759 1 2018 https://doi.org/10.3233/978-1-61499-843-3-750
The Brain on Low Power Architectures - Efficient Simulation of Cortical Slow Waves and Asynchronous States Piero Vicini Roberto Ammendola, Andrea Biagioni, Fabrizio Capuani, Paolo Cretaro, Giulia De Bonis, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Pier Stanislao Paolucci, Elena Pastorelli, Luca Pontisso, Francesco Simula. Advances in Parallel Computing 32, 760-769 1 2018 https://doi.org/10.3233/978-1-61499-843-3-760
Real-time track-less Cherenkov ring fitting trigger system based on Graphics Processing Units R Ammendola, A Biagioni, S Chiozzi, P Cretaro, A Cotta Ramusino, Stefano Di Lorenzo, R Fantechi, M Fiorini, O Frezza, A Gianoli, Gianluca Lamanna, F Lo Cicero, A Lonardo, M Martinelli, I Neri, PS Paolucci, E Pastorelli, R Piandani, M Piccini, L Pontisso, D Rossetti, F Simula, M Sozzi, P Vicini. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 2017 https://doi.org/10.1016/j.nima.2017.02.031
Latest generation interconnect technologies in APEnet+ networking infrastructure R Ammendola, F Lo Cicero, A Lonardo, P Cretaro, A Biagioni, O Frezza, ... J. Phys. Conf. Ser. 898, 082035 1 2017 10.1088/1742-6596/898/8/082035
Low latency network and distributed storage for next generation HPC systems: the ExaNeSt project R Ammendola, F Chaix, M Katevenis, A Biagioni, PS Paolucci, F Pisani, ... J. Phys. Conf. Ser. 898, 082045 1 2017 10.1088/1742-6596/898/8/082045
The next generation of exascale-class systems: The ExaNeSt project R Ammendola, A Biagioni, P Cretaro, O Frezza, FL Cicero, A Lonardo, ... 2017 Euromicro Conference on Digital System Design (DSD), 510-515 12 2017 10.1109/DSD.2017.20
Development of Network Interface Cards for TRIDAQ systems with the NaNet framework R Ammendola, A Biagioni, P Cretaro, S Di Lorenzo, M Fiorini, O Frezza, ... Journal of Instrumentation 12 (03), C03037 10.1088/1748-0221/12/03/C03037
GPU-based low-level trigger system for the standalone reconstruction of the ring-shaped hit patterns in the RICH Cherenkov detector of NA62 experiment R Ammendola, A Biagioni, S Chiozzi, P Cretaro, AC Ramusino, ... Journal of Instrumentation 12 (03), C03005 2017
Graphical processors for HEP trigger systems R Ammendola, A Biagioni, S Chiozzi, AC Ramusino, S Di Lorenzo, ... Nuclear Instruments and Methods in Physics Research Section A: Accelerators …
Co-designed Innovation and System for Resilient Exascale Computing in Europe: From Applications to Silicon (EuroEXA) G George, K Nectarios, P Nikela, N Konstantinos, K Vasilis, A Chloe, ... 2017
Power-efficient computing: experiences from the COSA project D Cesini, E Corni, A Falabella, A Ferraro, L Morganti, E Calore, ... Scientific Programming 2017 9 2017
Reconfigurable PCI Express cards for low-latency data transport in HEP experiments R Ammendola, A Biagioni, PS Paolucci, F Lo Cicero, O Frezza, P Vicini, ... Nuovo Cim. 40, 73 2017
GPU real-time processing in NA62 trigger system R Ammendola, A Biagioni, S Chiozzi, P Cretaro, S Di Lorenzo, R Fantechi, ... Journal of Physics: Conference Series 800 (1), 012046 2017
Dynamic many-process applications on many-tile embedded systems and HPC clusters: The EURETILE programming environment and execution platforms PS Paolucci, A Biagioni, LG Murillo, F Rousseau, L Schor, L Tosoratto, ... Journal of Systems Architecture 69, 29-53 17 2016
The exanest project: Interconnects, storage, and packaging for exascale systems M Katevenis, N Chrysos, M Marazakis, I Mavroidis, F Chaix, N Kallimanis, ... 2016 Euromicro Conference on Digital System Design (DSD), 60-67 33 2016
Graphics Processing Units for HEP trigger systems R Ammendola, M Bauce, A Biagioni, S Chiozzi, AC Ramusino, R Fantechi, ... Nuclear Instruments and Methods in Physics Research Section A: Accelerators … 3 2016
GPU-based Real-time Triggering in the NA62 Experiment R Ammendola, A Biagioni, P Cretaro, S Di Lorenzo, R Fantechi, M Fiorini, ... arXiv preprint arXiv:1606.04099 3 2016
NaNet-10: a 10GbE network interface card for the GPU-based low-level trigger of the NA62 RICH detector. R Ammendola, A Biagioni, M Fiorini, O Frezza, A Lonardo, G Lamanna, ... Journal of Instrumentation 11 (03), C03030 15 2016 IEEE:
GPU-based Low-Level Trigger System for Real-Time Cherenkov Ring Fitting R Ammendola, M Martinelli, L Pontisso, L Tosoratto, O Frezza, M Sozzi, ..
Graphics processors in hep low-level trigger systems R Ammendola, A Biagioni, S Chiozzi, AC Ramusino, P Cretaro, ... EPJ Web of Conferences 127, 00011 1 2016
NaNet3: The on-shore readout and slow-control board for the KM3NeT-Italia underwater neutrino telescope R Ammendola, A Biagioni, O Frezza, FL Cicero, M
* E. Pastorelli, et al., "Scaling to 1024 software processes and hardware cores of the distributed simulation of a spiking neural network including up to 20G synapses", (2015) arXiv:1511.09325, http://arxiv.org/abs/1511.09325
* E. Pastorelli, et al., "Impact of exponential long range and Gaussian short range lateral connectivity on the distributed simulation of neural networks including up to 30 billion synapses", (2015) arXiv:1512.05264, http://arxiv.org/abs/1512.05264
* P.S. Paolucci, et al. "Power, Energy and Speed of Embedded and Server Multi-Cores applied to Distributed Simulation of Spiking Neural Networks: ARM in NVIDIA Tegra vs Intel Xeon quad-cores", (2015), http://arxiv.org/abs/arXiv:1505.03015
* R. Ammendola, et al., "Graphics Processing Units for HEP trigger systems", Nuclear Instruments and Methods in Physics Research Section A, (2015), http://dx.doi.org/10.1016/j.nima.2015.11.106
* P.S. Paolucci et al., "Dynamic Many-process Applications on Many-tile Embedded Systems and HPC Clusters: the EURETILE programming environment and execution platforms", Journal of Systems Architecture, Available online 24 November 2015, ISSN 1383-7621, http://dx.doi.org/10.1016/j.sysarc.2015.11.008.
* S. Wesner et al., "Special Section on Terascale Computing", Future Generation Computer Systems, vol. 53 (2015) pag. 88-89, http://dx.doi.org/10.1016/j.future.2015.07.015
* R. Ammendola et al., "ASIP acceleration for virtual-to-physical address translation on RDMA-enabled FPGA-based network interfaces", Future Generation Computer Systems, vol. 53 (2015) pag. 109-118, http://dx.doi.org/10.1016/j.future.2014.12.012
* R. Ammendola et al., "A hierarchical watchdog mechanism for systemic fault awareness on distributed systems", Future Generation Computer Systems, vol. 53 (2015) pag. 90-99, http://dx.doi.org/10.1016/j.future.2014.12.015
* A. Lonardo et al., "NaNet: a configurable NIC bridging the gap between HPC and real-time HEP GPU computing", Journal of Instrumentation, vol. 10 (2015), http://dx.doi.org/10.1088/1748-0221/10/04/C04011
* R. Ammendola et al., "Architectural improvements and technological enhancements for the APEnet+ interconnect system", Journal of Instrumentation, vol. 10 (2015), http://dx.doi.org/10.1088/1748-0221/10/02/C02005
* R. Ammendola et al., "A multi-port 10GbE PCIe NIC featuring UDP offload and GPUDirect capabilities.", Journal of Physics, Conference Series, vol 664 (2015), http://dx.doi.org/10.1088/1742-6596/664/9/092002
* R. Ammendola et al., "Hardware and Software Design of FPGA-based PCIe Gen3 interface for APEnet+ network interconnect system", Journal of Physics, Conference Series, vol 664 (2015), http://dx.doi.org/10.1088/1742-6596/664/9/092017
* R. Ammendola, et al., "Fast algorithm for real-time rings reconstruction", (2015) GPU Computing in High-Energy Physics (GPUHEP2014) : Pisa, Italy. http://dx.doi.org/10.3204/DESY-PROC-2014-05/42
* A. Lonardo et al., "A FPGA-based Network Interface Card with GPUDirect enabling real-time GPU computing in HEP experiments", (2015) GPU Computing in High-Energy Physics (GPUHEP2014) : Pisa, Italy, September 2014, http://dx.doi.org/10.3204/DESY-PROC-2014-05/16
* R. Ammendola, et al., "GPUs for the realtime low-level trigger of the NA62 experiment at CERN" (2015) , GPU Computing in High-Energy Physics (GPUHEP2014) : Pisa, Italy, http://dx.doi.org/10.3204/DESY-PROC-2014-05/15
* R. Ammendola et al., "LO-FA-MO: Fault Detection and Systemic Awareness for the QUonG Computing System," in Reliable Distributed Systems (SRDS), 2014 IEEE 33rd International Symposium on, pp.265-270, http://dx.doi.org/10.1109/SRDS.2014.33
* L. Schor et al., "EURETILE Design Flow: Dynamic and Fault Tolerant Mapping of Multiple Applications onto Many-Tile Systems", Parallel and Distributed Processing with Applications (ISPA), 2014 IEEE Int. Symp. on, (2014) pag. 182-189, http://dx.doi.org/10.1109/ISPA.2014.32
* P.S. Paolucci, et al., "EURETILE D7.3 - Dynamic DAL benchmark coding, measurements on MPI version of DPSNN-STDP (distributed plastic spiking neural net) and improvements to other DAL codes" (2014) arXiv:1408.4587, http://arxiv.org/abs/1408.4587
* A. Lonardo, et al., "NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features", (2014) arXiv:1406.3568 [physics.ins-det], http://arxiv.org/abs/1406.3568
* G. Lamanna et al., "GPUs for real-time processing in HEP trigger systems", Journal of Physics: Conference Series, vol. 513, (2014), International Conference on Computing in High Energy and Nuclear Physics (CHEP) 2013, http://dx.doi.org/10.1088/1742-6596/513/1/012017
* R. Ammendola et al., "Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network", Journal of Physics: Conference Series, vol. 523, (2014), Workshop on Advanced Computing & Analysis Techniques in Physics Research (ACAT) 2013, http://dx.doi.org/10.1088/1742-6596/523/1/012013
* S. Chiozzi, et al., "GPUs for Online Processing in Low-Level Trigger Systems", (2014), http://inspirehep.net/record/1360142/files/PoS(TIPP2014)208.pdf
* R. Ammendola et al., "NaNet: a flexible and configurable low-latency NIC for real-time trigger systems based on GPUs", in JINST, Journal of Instrumentation, vol. 9 (2014), Proceedings of Topical Workshop on Electronics for Particle Physics (TWEPP) 2013, IOP Publishing, 2013, http://dx.doi.org/10.1088/1748-0221/9/02/C02023.
* R. Ammendola, et al., "Architectural improvements and 28 nm FPGA implementation of the APEnet+ 3D Torus network for hybrid HPC systems", (2014) Journal of Physics: Conference Series vol. 513 (5), http://dx.doi.org/10.1088/1742-6596/513/5/052002
* A. Biagioni, et al., "Evolution of FPGA-based network acceleration for GPUs", poster at the conference Perspectives of GPU Computing in Physics and Astrophysics, Rome, Italy, September 15-17, 2014
* A. Biagioni, et al., "Development of a GPU aware NIC: from HPC to HEP experiments", poster at the GPU Technology Conference (GTC), March 24-26, 2013 - San Jose (California)
* P.S. Paolucci, et al., "Distributed simulation of polychronous and plastic spiking neural networks: strong and weak scaling of a mini app benchmark", Poster at the Workshop 'Dagli atomi al cervello', Politecnico di Milano, 27 Jan 2014
* M. Bauce, et al., "GPUs for Online Processing in Low-Level Trigger Systems", Proceedings of Science (TIPP2014) Technology and Instrumentation in Particle Physics 2014, http://inspirehep.net/record/1360142/files/PoS(TIPP2014)208.pdf
* R. Ammendola et al. "Design and implementation of a modular, low latency, fault-aware, FPGA-based network interface" (2013) Reconfigurable Computing and FPGAs (ReConFig), 2013 International Conference on, IEEE, http://dx.doi.org/10.1109/ReConFig.2013.6732275
* R. Ammendola et al. "Virtual-to-Physical address translation for an FPGA-based interconnect with host and GPU remote DMA capabilities", (2013) Field-Programmable Technology (FPT), 2013 International Conference on, http://dx.doi.org/10.1109/FPT.2013.6718331
* R. Ammendola, et al. "APEnet+ 34 Gbps data transmission system and custom transmission logic" (2013) Journal of Instrumentation 8 (12), C12022, http://dx.doi.org/10.1088/1748-0221/8/12/C12022
* P.S. Paolucci, et al. "Distributed simulation of polychronous and plastic spiking neural networks: strong and weak scaling of a representative mini-application benchmark executed on a small-scale commodity cluster" (2013) arXiv:1310.8478 [cs.DC], http://arxiv.org/abs/1310.8478
* R. Ammendola, et al., "A heterogeneous many-core platform for experiments on scalable custom interconnects and management of fault and critical events, applied to many-process applications: Vol. II, 2012 technical report" (2013) arXiv:1307.1270 [cs.DC], http://arxiv.org/abs/1307.1270
* R. Ammendola, et al., "'Mutual Watch-dog Networking': Distributed Awareness of Faults and Critical Events in Petascale/Exascale" (2013) arXiv:1307.0433 [cs.DC], http://arxiv.org/abs/1307.0433
* R. Ammendola, et al., "GPU peer-to-peer techniques applied to a cluster interconnect", (2013) Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International, http://dx.doi.org/10.1109/IPDPSW.2013.128
* Paolucci, P.S., Bacivarov, I., Goossens, G., Leupers, R., Rousseau, F., Schumacher, C., Thiele, L., Vicini, P., "EURETILE 2010-2012 summary: first three years of activity of the European Reference Tiled Experiment", (2013), arXiv:1305.1459 [cs.DC] , http://arxiv.org/abs/1305.1459 - then published as ISBN: 978-88-908488-0-3 (2013), http://dx.doi.org/10.12837/2013T01
* R. Ammendola, et al. "APEnet+: a 3D Torus network optimized for GPU-based HPC Systems" (2012) Journal of Physics: Conference Series 396 (4), 042059, http://dx.doi.org/10.1088/1742-6596/396/4/042059
* R. Ammendola, et al., "A 34 Gbps data transmission system with FPGAs embedded transceivers and QSFP+ modules" (2012) Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2012 IEEE, http://dx.doi.org/10.1109/NSSMIC.2012.6551230
* S. Amerio, et al., "Applications of GPUs to online track reconstruction in HEP experiments", (2012) Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2012 IEEE, http://dx.doi.org/10.1109/NSSMIC.2012.6551422
* A. Biagioni, et al. "The Distributed Network Processor: a novel off-chip and on-chip interconnection network architecture" (2012) arXiv:1203.1536 [cs.AR], http://arxiv.org/abs/1203.1536
* R. Ammendola, et al., "APEnet+: High Bandwidth 3D Torus direct network for petaflops scale commodity clusters", (2011), Journal of Physics: Conference Series 331 (5), 052029, http://dx.doi.org/10.1088/1742-6596/331/5/052029
* R. Ammendola, et al., "QUonG: A GPU-based HPC System Dedicated to LQCD Computing." (2011) In Proceedings of the 2011 Symposium on Application Accelerators in High-Performance Computing (SAAHPC '11). IEEE Computer Society, Washington, DC, USA, 113-122. http://dx.doi.org/10.1109/SAAHPC.2011.15
* R. Ammendola, et al., "APEnet+: a 3D toroidal network enabling Petaflops scale Lattice QCD simulations on commodity clusters" proceedings of The XXVIII International Symposium on Lattice Field Theory PoS(Lattice 2010)022 and arXiv:1012.0253v1
[70] R. Ammendola, A. Biagioni, O. Frezza, F. Lo Cicero, A. Lonardo, P.S. Paolucci, D. Rossetti, A. Salamon, G. Salina, F. Simula, L. Tosoratto, P. Vicini - Mastering multi-GPU computing on a torus network - GPU Technology Conference 2010 (GTC)
[69] Pier Stanislao Paolucci "FP7 EURETILE Project: EUropean REference TILed architecture Experiment" HipeacInfo, Quarterly Newsletter, Number 24, page 11, October 2010 (http://www.Hipeac.net/newslette) https://sites.google.com/site/pierstanislaopaolucci/euretile/PaolucciEuretileHipeacInfo24October2010.pdf
[68] Chagoya-Garzon, A.; Guerin, X.; Rousseau, F.; Petrot, F.; Rossetti, D.; Lonardo, A.; Vicini, P.; Paolucci, P.S., "Synthesis of Communication Mechanisms for Multi-tile Systems Based on Heterogeneous Multi-processor System-On-Chips," Rapid System Prototyping, 2009. RSP '09. IEEE/IFIP International Symposium on, pp. 48-54, 23-26 June 2009
[67] [MPSOC'08] Pier Stanislao Paolucci "Four Levels of Parallelism to be Managed in the DIOPSIS based SHAPES Multi-Tiled Architecture", 8th International Forum on Application-Specific Multi-Processor SoC, MPSOC'08, Aachen Germany 23-27 June http://www.mpsoc-forum.org/2008/slides.html, https://sites.google.com/site/pierstanislaopaolucci/shapes/MPSOC2008_SHAPES_Diopsis_Paolucci.pdf
[66] Gert Goossens, Pier Stanislao Paolucci, Piergiovanni Bazzana, Andrea Ricciardi "VLIW Compilation Environment and Multi-Processor Architecture of Diopsis, the RISC+ floating-Point VLIW DSP System-On-Chip Designed for High Quality Acoustic Applications" Tutorial at - the 41st Annual IEEE/ACM International Symposium on Microarchitecture, 2008. MICRO-41 2008, Lago di Como, Italy. http://www.microarch.org/micro41/
[65] ACM Transaction [TECS2008] Katalin Popovici, Xavier Guerin, Frederic Rousseau, Pier Stanislao Paolucci, and Ahmed Amine Jerraya. 2008. Platform-based software design flow for heterogeneous MPSoC. ACM Trans. Embed. Comput. Syst. 7, 4, Article 39 (August 2008), 23 pages. http://dx.doi.org/10.1145/1376804.1376807
[64] T. Sporer, P.S. Paolucci, et al. "SHAPES a Scalable parallel HW/SW Architecture Applied to Wave Field Synthesis", Proceed. of AES 32nd Int. Conf., Hillerod, Denmark, September 21-23, 2007. ISBN 9780937803608 093780360X
[63] K. Popovici, X. Guerin, F. Rousseau, P.S. Paolucci, and A. Jerraya. "Efficient Software Development Platforms for Multimedia Applications at Different Abstraction Levels". (2007) In Proceedings of the 18th IEEE/IFIP International Workshop on Rapid System Prototyping (RSP '07). IEEE Computer Society, Washington, DC, USA, 113-122. http://dx.doi.org/10.1109/RSP.2007.21
[62] P. S. Paolucci, A. A. Jerraya, R. Leupers, L. Thiele, and P. Vicini. "SHAPES:: a tiled scalable software hardware architecture platform for embedded systems". (2006). In Proceedings of the 4th international conference on Hardware/software codesign and system synthesis (CODES+ISSS '06). ACM, New York, NY, USA, 167-172. http://doi.acm.org/10.1145/1176254.1176297
[61] "The Diopsis Multiprocessor Tile of SHAPES" P.S. Paolucci, 6th International Forum on Application-Specific Multi-Processor SoC, MPSOC'06, Estes Park, Colorado (2006) https://sites.google.com/site/pierstanislaopaolucci/shapes/Paolucci_Diopsis_SHAPES_MPSOC06.pdf, http://www.mpsoc-forum.org/2006/index.html
[60] Janus: a Gigaflop RISC + VLIW SoC Tile, Pier S. Paolucci et al. Hot Chips 15 Conference, Stanford, (August 18th, 2003) http://www.hotchips.org/archives/hc15/3_Tue/9.atmel.pdf
[59] mAgic-FPU and MADE: A customizable VLIW core and the modular VLIW processor architecture description environment Pier S. Paolucci, Philippe Kajfasz, Philippe Bonnot, Bernard Candaele, Daniel Maufroid, Elena Pastorelli, Andrea Ricciardi et al., Computer Physics Communications, 139,1 (Sep 2001) 132-143 http://dx.doi.org/10.1016/S0010-4655(01)00235-1
[58] mAgic FPU: VLIW Floating Point Engines for “System-on-Chip” Applications D. Maufroid, P.S. Paolucci, P. Kajfasz, A. Bertini, "Business and Work in the Information Society: New Technologies and Applications" Emmsec 99 Conference Proceedings, (1999) Stockholm, Sweden, Edited by Jean-Yves Roger, Brian Stanford-Smith & Paul T. Kidd. CheshireHenbury (1999) ISBN 90-5199-491-5
[57] The teraflop supercomputer APEmille: architecture, software and project status report F. Aglietti, A. Bartoloni, C. Battista, S. Cabasino, M. Cosimi, A. Michelotti, A. Monello, E. Panizzi, P. S. Paolucci, W. Rinaldi et al. Computer Physics Communications, 110,1-3 (May 1998) 216-219
[56] An Overview of the APEmille Project A. Bartoloni, S. Cabasino, M. Cosimi, P. De Riso, A. Lonardo, A. Michelotti, E. Panizzi, P. S. Paolucci, D. Rossetti, M. Torelli et al. Nuclear Physics B - Proceedings Supplements, 60,1-2 (Jan 1998) 237-240
[55] The ‘Cubed Sphere’: (A New Method for the Solution of Partial Differential Equations in Spherical Geometry) C.Ronchi, R. Iacono and P.S. Paolucci Journ. Comp. Phys. 124,1 (March 1996) http://dx.doi.org/10.1006/jcph.1996.0047
[54] N-Body Classical Systems and Neural Networks on APE100 Massive Parallel Computers P.S. Paolucci Int. Journ. Mod. Phys. C 6(1995)169
[53] SIMD algorithm for Matrix Transposition N. Cabibbo, P.S. Paolucci Int. Journ. Mod. Phys. C 6(1995)183
[52] Finite Difference Approximation to the Shallow Water Equations on a Quasi-Uniform Spherical Grid C.Ronchi, R. Iacono and P.S. Paolucci B. Hertzberger and G. Serazzi (Editors) High Performance Computing and Networking - HPCN Europe 1995 (Springer-Verlag, 1995)741
[51] A Lattice Study of the Exclusive B → K*γ decay amplitude, using the Clover action at β=6.0 A.Abada et al. CERN-TH/95-59 ROMA prep N. 94/1056 submitted to Phys.Lett. B
[50] Quenched B_K-parameter with the Wilson and Clover Actions at β=6.0 A.Bartoloni et al. Nucl. Phys B (Proc.Suppl) 42(1995)397
[49] B → K*γ decay on APE A.Abada et al. Nucl. Phys B (Proc.Suppl) 42(1995)379
[48] A High Statistics Lattice Calculation of f_B^static at β=6.2 using the Clover Action C.R.Allton et al. Phys. Lett. B 326(1994)295
[47] APE Results of Hadron Masses in Full QCD Simulations S.Antonelli et al. Nucl. Phys. B (Proc.Suppl) 42(1995)300
[46] The new wave of the APE Project: APEmille A. Bartoloni et al. Nucl. Phys. B (Proc.Suppl) 42(1995)17
[45] Lattice Calculations of D- and B-Meson Semileptonic Decays, Using the Clover Action at β=6.0 on APE C.R. Allton et al. Phys. Lett. B 345 (1995) 513
[44] Status of APE100 and Full QCD Simulations S. Antonelli et al. Nucl. Phys. B (Proc. Suppl.) 34(1994)826
[43] D- and B-Meson Semi-Leptonic Decays A. Abada et al. Nucl. Phys. B (Proc. Suppl.) 34(1994)477
[42] Decay Constants of Heavy-Light Mesons C.R. Allton et al. Nucl. Phys. B (Proc. Suppl.) 34(1994)456
[41] Light Quark Physics on Different Lattices C.R.Allton et al. Nucl. Phys. B (Proc. Suppl.) 34(1994)360
[40] Polyakov Loops and Finite-Size Effects of Hadron Masses in Full Lattice QCD S.Antonelli et al. Phys. Lett B 345(1995)49
[39] A High Statistics Lattice Calculation of f_B in the static limit on Ape β=6.2 using the Clover Action C.R.Allton et al. Nucl. Phys. B 413(1994)461
[38] A Hardware Implementation of the APE100 Architecture A. Bartoloni et al. Int. Journ. Mod. Phys. C 4(1993)969
[37] The Software of the APE100 Processor A. Bartoloni et al. Int. Journ. Mod. Phys. C 4(1993)955
[36] The APE100 Computer: (I) the Architecture C. Battista et al. Int. Journ. of High Speed Computing 5(1993)637
[35] Preliminary Results from APE100 A. Bartoloni et al. Nucl. Phys. B (Proc. Suppl.) 30(1993)469
[34] Dynamic Parsers and Evolving Grammars S. Cabasino, P.S. Paolucci, G.M. Todesco ACM SIGPLAN Notices 27(1992)
[33] Dynamic Parsers, Evolving Grammars and Incremental Languages S. Cabasino, P.S. Paolucci, G.M. Todesco Internal Note N. 863 Dip.Fisica Univ. Roma (1992)
[32] Physics with APE100: First Experiences P.S. Paolucci C.Verkerk and W.Wojcik (Editors) Computing in High Energy Physics '92 (Annecy, 1992) CERN 92-07 (CERN,Geneve, 1992)365
[31] Simulazione di flussi 3-D su super-computer paralleli APE100 con solutore esplicito F. Mandolini, P.S. Paolucci, C. Bruno Private Communication (Roma, 1992)
[30] LBE Simulations of Rayleigh-Benard Convection on the APE100 Parallel Processor A. Bartoloni et al. Int. Journ. Mod. Phys. C 4(1993)993
[29] A High Performance Single Chip Processing Unit for Parallel Processing and Data Acquisition Systems G. Bastianello et al. Nucl. Instr. and Meth. in Phys. Res. A 324(1993)543
[28] APE Quenched Spectrum S. Cabasino et al. Nucl. Phys. B (Proc. Suppl.) 20(1991)399
[27] β=6.0 Staggered Quenched Fermions S. Cabasino et al. Phys. Lett. B 258(1991)202
[26] β=6.0 Quenched Wilson Fermions S. Cabasino et al. Phys. Lett. B 258(1991)195
[25] MAD, a Floating Point Unit for Massively-Parallel Processors A. Bartoloni et al. Particle World 2(1991)65
[24] From APE to APE-100: from 1 to 100 Gigaflops in Lattice Gauge Theory Simulations N. Avico et al. Comput. Phys. Commun. 57(1989)285
[23] System and Applicative Software for the APE Computer Family P. Bacilieri et al. Workshop A.I.C.A. (Bologna, 1989)
[22] From APE to APE100: Present and Future of the APE Project P. Bacilieri et al. M. Budinich, E. Castelli and A. Colavita (Editors) International Conference on the Impact of Digital Microelectronics and Microprocessors (Trieste, 1988) (World Scientific, 1988)
[21] The APE Computer: A Fast Array Processor M. Albanese et al. Atti della Commemorazione di E. Amaldi (Roma, 1990)
[20] Status of Quenched QCD on APE Computers S. Cabasino et al. Nucl. Phys. B (Proc. Suppl.) 16(1990)554
[19] The APE with a Small Mass S. Cabasino et al. Nucl. Phys. B (Proc. Suppl.) 17(1990)431
[18] The APE with a Small Jump S. Cabasino et al. Nucl. Phys. B (Proc. Suppl.) 17(1990)218
[17] Staggered Fermions at β=5.7: Smeared Operators on Large Lattices P. Bacilieri et al. Nucl. Phys. B 343(1990)228
[16] APE: a Fast Array Processor for Physics Simulations P. Bacilieri et al. L.P. Kartashev and S.I. Kartashev (Editors) Third International Conference on Supercomputing (Boston, 1988) International Supercomputing Institute 1(1988)157
[15] A New Computation of the Correlation Length near the Deconfining Transition in SU(3) P. Bacilieri et al. Phys. Lett. B 224(1989)333
[14] The Deconfining Phase Transition and the Glueball Channels in Pure Gauge QCD P. Bacilieri et al. Phys. Lett. B 220(1989)607
[13] On the Order of Deconfining Phase Transition in Pure Gauge QCD P. Bacilieri et al. Nucl. Phys. B (Proc. Suppl.) 9(1989)315
[12] The Deconfining Phase Transition in Lattice Gauge SU(3) P. Bacilieri et al. Nucl. Phys. B 318(1989)553
[11] Order of Deconfining Phase Transition in Pure-Gauge QCD P. Bacilieri et al. Phys. Rev. Lett. 61(1988)1545
[10] The Hadronic Mass Spectrum in Quenched Lattice QCD: β=5.7 P. Bacilieri et al. Nucl. Phys. B 317(1989)509
[9] The Hadronic Mass Spectrum in Quenched Lattice QCD: Results at β=5.7 and β=6.0 P. Bacilieri et al. Phys. Lett. B 214(1988)115
[8] Eigenstates and Limit Cycles in the SK Model S. Cabasino, E. Marinari, P.S. Paolucci and G. Parisi J. Phys. A, Math. Gen. 21(1988)4201
[7] Scaling in Lattice QCD: Glueball Masses and String Tension P. Bacilieri et al. Phys. Lett. B 205(1988)535
[6] Glue Ball Masses and the Loop-Loop Correlation Functions M. Albanese et al. Phys. Lett. B 197(1987)400
[5] Glue Ball Masses and String Tension in Lattice Q.C.D. M. Albanese et al. Phys. Lett. B 192(1987)163
[4] A 3-D Filtered Back-Projection Algorithm for a 3-D PET M. Conti et al. V. Cappellini and A.G. Constantinides (Editors) Digital Signal Processing 87 (Elsevier Science Publishers B.V., North-Holland, 1987)
[3] The Ape Computer: an Array Processor Optimized for Lattice Gauge Theory Simulations M. Albanese et al. Comp. Phys. Comm. 45(1987)345
Abstract: The APE computer is a high performance processor designed to provide massive computational power for intrinsically parallel and homogeneous applications. APE is a linear array of processing elements and memory boards that execute in parallel in SIMD mode under the control of a CERN/SLAC 3081/E. Processing elements and memory boards are connected by a 'circular' switchnet. The hardware and software architecture of APE, as well as its implementation, are discussed in this paper. Some physics results obtained in the simulation of lattice gauge theories are also presented.
[2] The Ape Project: a Gigaflop Parallel Processor for Lattice Calculations. Bacilieri P; Cabasino S; Marzano F; Paolucci P; Petrarca S; Salina G; Cabibbo N; Giovannella C; Marinari E; Parisi G; Costantini F; Fiorentini G; Galeotti S; Passuello D; Tripiccione R; Fucci A; Petronzio R; Rapuano F; Pascoli D; Rossi P; Remiddi E; Rusack RW, Computing in High Energy Physics 85, Conference, Amsterdam 25-28 Jun, 1985 (Elsevier Science Publishers B.V., North-Holland, 1986) L.O. Hertzberger and W. Hoogland (Editors) pag.330-337 ISBN: 0444879730 http://cdsweb.cern.ch/record/108031
Abstract: A new special purpose parallel processor (APE) presently under development is presented. The theoretical computing power of the processor is 1 Giga-flops and the memory can be expanded to 512 Megabytes. Sixteen 32 bit floating point processors each with a computing power of 64 Mega-flops are driven in parallel as a single instruction multiple data (SIMD) machine under the control of a 3081/E. Each floating-point unit is connected to three 8 Megabyte memories which can also be accessed by the 3081/E. Though this machine can be used as a general purpose array processor the hardware has been optimized for lattice QCD calculations.
[1] Proposal of a Computer for Lattice Calculations P. Bacilieri et al. Nota interna ROM2F/85/6 (Roma, 1985) Dip. di Fisica, Universita' di Roma II
Acknowledgements:
This site is dedicated to my wife Antonella, my daughter Fiamma Flavia, and my parents Birgit and Ermete. A human being is what his brain has learned from interacting with other humans, institutions and mother nature (items marked with * acted on me only through media; the list is in reverse chronological order of appearance): Maurizio Mattia, Paolo Del Giudice, Lothar Thiele, Gert Goossens, Rainer Leupers, Frederic Rousseau, Nancy B. Green, Antonio Cerruto, Fabio Acerbi, Massimiliano Testa, Gregorio Magnanti, Kyuzo Mifune*, Jigoro Kano*, Piergiovanni Bazzana, Ipitec s.r.l., Fiamma Flavia Paolucci, Eugenio Guarino, Yves Fusella, Philippe Kajfasz, Bernard Candaele, Lucio Russo*, Andrea Baiocchini, Andrea Ricciardi, Ben Altieri, Paola Pasquini, Noemi Primieri, Shirine Rofougaran, Rosina Tonti, Navone Pasquini, Nergal s.r.l., Elena Pastorelli, Giovanni Stefani, Davide Rossetti, Alessandro Lonardo, Piero Vicini, Mariano Manenti, Carlo Iodice, Stefano Pratesi, Riccardo Simonazzi, Valeria Ghisalberti, Thera s.r.l., Emanuele Panizzi, Giuseppe Bastianello, Antonella Pasquini, Mario Torelli, Marcello Cini, Gian Marco Todesco (link to his "toonz" creature), Raffaele Tripiccione, Enzo Marinari, Nicola Cabibbo, Giorgio Parisi, INFN Roma, Itaca s.r.l., Francesca Grassi, Francesco Danza, Simone Cabasino, Johann Sebastian Bach*, Richard Feynman*, Claudio Spina, Carlo Bernardini, Giorgio Salvini, Francesco Severi, Walter Tross, Dip. di Fisica dell'Universita' di Roma, Federica Sampo', Erminia Belli, Albert Einstein*, Le Scienze (Scientific American), Camillo Marra, Francesca Santoni, Maria Banci, Hector Hawton*, Corrado Sampo', Liceo Classico Cristo Re di Roma, Jacques Yves Cousteau*, Paola Fabbri, A.S. Roma Nuoto, Carlo Sardella, CONI F.I.N. Nuoto Roma, Elisabetta Horvath, Ludwig Van Beethoven*, CONI FIN Milano, Suor Giuseppa, Argia Savi, Carlo Paolucci, Ellen Mühlenfeldt, Ermete Paolucci, Birgit Kirsten Mühlenfeldt