Project Definition

Introduction

Understanding the behavior and communication patterns of parallel workloads is essential for executing them efficiently on many-core processors and in high-performance computing. Workload characterization mainly seeks to determine application size, network packet sizes, protocol influence, communication patterns, and similar properties. In this context, it is important to know the workload, how it was programmed, the periodicity of its execution and communication, which nodes were used, when and for how long, the communication pattern of the programming model, the size of the messages and of the data handled in memory, and the kind of parallelism requested (instruction or thread). The characterization should be based on a large set of parallel workloads and programming languages and models (OpenMP, MPI, TBB, etc.). Thus, this project will run and analyze different benchmarks (SPEC, SPLASH, NAS, and PARSEC) written in different languages. Additionally, the project aims to propose new benchmarks for many-core architectures.

The next generation of many-core processors will be able to support both shared-memory and message-passing programming models. However, research groups do not yet know the impact of these programming models on chips with a high number of cores. Furthermore, there are no studies on how NoCs and memories (Non-Uniform Memory Access and Non-Uniform Cache Access) will scale for these programming models on such chips. Workloads have been developed and are ready for multiprocessor computers, multi-core processors, and clusters, but there is no knowledge about the characteristics of workloads for many-core processors. Thus, it is necessary to evaluate the performance of workloads based on the characterization that will be done in this project. With these results, many-core architectures can then be proposed.

Usually, the degree of confidence in a performance assessment grows as results are obtained through analytical models, simulations, and measurements. However, in the context of this project, many-core processors either do not exist yet or are not accessible because they are still objects of study in industry. Therefore, the models to be used in this project are analytical models and simulation. There are a number of so-called full-system simulators that can deal with all the characteristics of a complete computer system, such as the architecture, operating system, compilers, and execution of real programs. These simulators are seen by academia and industry as highly reliable and effective for assessing the performance of a prototype before the real system exists. However, these simulators may have restrictions, or lack flexibility, which may not allow the modeling of a many-core machine. In this case, the use of more than one simulation environment is necessary, in addition to analytical models.

Therefore, in this project we will study parallel workloads and many-core machines. The results of these studies will allow us to evaluate parallel workloads on many-core architectures using analytical models and simulation. This performance evaluation will let us identify or propose a many-core architecture that better supports parallel workloads. Additionally, the characterization and proposal of new workloads for many-core architectures is one of the aims of this project.

Objectives

This proposal of cooperation between PUC Minas and LIG-INRIA aims at the exchange of experience and knowledge between both countries. Thus, the main goals of this project are: (i) exchange knowledge and work in the context of characterization and evaluation of parallel workloads for many-core architectures, (ii) carry out missions in both institutions for Brazilian and French students and professors, (iii) present the results obtained in international conferences, and (iv) improve the cooperation between Brazil and France.

Parallel Workload Characterization

The main goals of parallel workload characterization are: (i) run several experiments with the benchmarks cited in section 2 on clusters composed of many-cores and GPUs, as in the works [32, 14], using both the shared-memory and the message-passing programming models, (ii) study the behavior and the communication patterns of the workloads, (iii) characterize the workloads using the constraints presented in section 2, and (iv) propose and develop parallel workloads for many-core architectures using the knowledge obtained in items (ii) and (iii).
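As an illustration of what studying communication patterns in item (ii) could involve, the following is a minimal sketch in Python. It assumes a hypothetical trace of (source, destination, size in bytes) records; the trace format and the function name `characterize` are assumptions made for this example, not the output of any specific tracing tool used in the project.

```python
def characterize(messages, num_ranks):
    """Summarize a message trace.

    messages: iterable of (src, dst, size_bytes) tuples (hypothetical format).
    Returns the communication matrix (bytes sent from src to dst) and the
    mean message size in bytes.
    """
    matrix = [[0] * num_ranks for _ in range(num_ranks)]
    count = 0
    total_bytes = 0
    for src, dst, size in messages:
        matrix[src][dst] += size  # accumulate traffic volume per pair
        count += 1
        total_bytes += size
    mean_size = total_bytes / count if count else 0.0
    return matrix, mean_size


if __name__ == "__main__":
    # Toy trace with two ranks; the values are purely illustrative.
    trace = [(0, 1, 100), (1, 0, 50), (0, 1, 300)]
    matrix, mean_size = characterize(trace, num_ranks=2)
    print(matrix)
    print(mean_size)
```

A matrix of this kind shows which pairs of processes exchange the most data, and the mean message size is one of the characterization parameters mentioned in the introduction.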

Many-core Architectures Studies

The main goals of the many-core architecture studies are: (i) study the state of the art of many-core architectures, (ii) study the proposals for Networks-on-Chip (NoC) for many-core machines, (iii) study the memory subsystem for many-core machines, (iv) propose and develop protocols for communication between cores and memory, and (v) develop an architecture model for a many-core processor with memory and NoC embedded in the chip.

Performance Evaluation

The main goals of the performance evaluation of parallel workloads on many-core architectures are: (i) propose analytical models for many-core architectures, (ii) identify and select tools to simulate real workloads on many-core architectures, (iii) evaluate the performance of parallel workloads using analytical models and simulation, (iv) analyze the constraints of many-core architectures and propose solutions, (v) propose and validate a workload model for many-core architectures, and (vi) validate the proposal of a many-core architecture.
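To give a concrete idea of the kind of analytical model meant in item (i), the sketch below computes the mean hop count between cores in a k x k 2D mesh NoC under uniform random traffic. The mesh topology, the uniform traffic assumption, and the function names are assumptions chosen for illustration only; the project does not commit to this model. The brute-force enumeration is checked against the closed form 2(k^2 - 1) / (3k), which counts a core sending to itself as zero hops.

```python
from itertools import product


def mean_hops_mesh(k):
    """Mean Manhattan distance over all ordered pairs of nodes in a k x k mesh."""
    nodes = list(product(range(k), repeat=2))
    total = sum(
        abs(ax - bx) + abs(ay - by)
        for (ax, ay) in nodes
        for (bx, by) in nodes
    )
    return total / (len(nodes) ** 2)


def mean_hops_closed_form(k):
    """Closed form for the same quantity: 2 * (k^2 - 1) / (3 * k)."""
    return 2 * (k * k - 1) / (3 * k)


if __name__ == "__main__":
    for k in (2, 4, 8):
        print(k, mean_hops_mesh(k), mean_hops_closed_form(k))
```

A model like this can be compared against simulation results to check whether the simulated network behaves as expected before more detailed effects such as contention are considered.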

Measurable Objectives:

The schedule of activities presented in the table below is divided according to each area of this project. It is important to note that the date ranges do not overlap because each one marks a period of emphasis for that area of study. However, there is a correlation and dependence between the areas, which can be described as follows: task 3 takes the longest, but it requires going back to tasks 1 and 2. Similarly, in task 1, the characterization of workloads is associated with the study of many-core architectures. In task 2 we focus on simulation proposals. Thus, there is a link between tasks 2 and 3 concerning the analytical models and simulation used to evaluate aspects of many-core architectures, for example, NoC and NUCA.

Task  Start Date  End Date    Description
1     01-01-2010  31-05-2010  Parallel Workload Characterization
2     01-06-2010  30-11-2010  Many-core Architectures Studies
3     01-12-2010  01-10-2011  Performance Evaluation of Parallel Workloads on Many-core Architectures
4     01-10-2011  01-12-2011  Reports and Papers

Besides the schedule of activities, some missions of professors from Brazil and France are expected. These missions are a way to increase the interaction between the teams. Participation in seminars and workshops is included to increase the audience for, and the dissemination of, the research in the academic community.

Deliverables

    1. Technical Report INRIA Sep. 2010

    2. Technical Report INRIA Nov. 2011

    3. Technical Report FAPEMIG Nov. 2011