
A-LOOP

AMP system: 2-core ARM Cortex A9/Linux OS and 4-core Leon3/Linux OS, OpenMP library and Hardware Profiling system

G. Valente, V. Muttillo, A. Bufalino, M. Santic, L. Pomante, M. Faccio, F. Federici

OVERVIEW

A-LOOP is a System on Module (SoM) prototype for aerospace applications, developed starting from Zynq-7000, that focuses on the interactions between 2 modules ("Isles of computational elements") and on the monitoring action without overhead insertion.

MOTIVATIONS

1) Embedded systems development is driven by basic functional specifications, enriched with a set of non-functional requirements (performance, power dissipation, etc.).

2) One of the techniques that can be exploited is to develop Isles of computational elements (Modules) with different characteristics, each one able to satisfy some non-functional specifications, in order to realize smart Systems on Module (SoM).

3) SoCs with FPGA can be viewed as platforms useful for prototyping these kinds of architectures.

PROPOSED PLATFORM

The proposed platform is a SoM with 2 modules that share a memory region on external memory:

-> ISLE #1: a dual-core ARM Cortex A9 with SMP Linux OS, able to interface with the external world. It provides data to Isle #2 and collects results from it. It is also able to monitor the performance of Isle #2, without introducing software overhead, by means of a hardware profiling system.

-> ISLE #2: a quad-core Leon3 with SMP Linux OS, able to execute parallel applications based on the OpenMP library. In particular, in this demo, it executes a MANET localization algorithm.
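The data exchange between the two isles through the shared memory region can be illustrated with a minimal sketch. The poster does not describe the actual memory layout or handshake, so everything here is an assumption made for illustration: the `shared_region` structure, its flag fields, the payload size and the three helper functions are hypothetical, and the "workload" is a placeholder sum rather than the MANET localization algorithm. The region is modelled as an ordinary C structure in a single process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_SAMPLES 4  /* hypothetical payload size */

/* Hypothetical layout of the memory region shared by the two isles. */
typedef struct {
    volatile uint32_t data_ready;    /* set by Isle #1, cleared by Isle #2 */
    volatile uint32_t result_ready;  /* set by Isle #2, cleared by Isle #1 */
    int32_t samples[N_SAMPLES];      /* input data provided by Isle #1 */
    int32_t result;                  /* result produced by Isle #2 */
} shared_region;

/* Isle #1 side: write the payload, then raise the flag. */
void isle1_post(shared_region *shm, const int32_t *samples)
{
    assert(shm != NULL && samples != NULL);
    for (size_t i = 0; i < N_SAMPLES; i++)
        shm->samples[i] = samples[i];
    shm->data_ready = 1;
}

/* Isle #2 side: returns 1 if a pending request was served, 0 otherwise. */
int isle2_process(shared_region *shm)
{
    assert(shm != NULL);
    if (!shm->data_ready)
        return 0;
    int32_t acc = 0;
    for (size_t i = 0; i < N_SAMPLES; i++)
        acc += shm->samples[i];      /* placeholder workload */
    shm->result = acc;
    shm->data_ready = 0;
    shm->result_ready = 1;
    return 1;
}

/* Isle #1 side: returns 1 and stores the result if one is available. */
int isle1_collect(shared_region *shm, int32_t *out)
{
    assert(shm != NULL && out != NULL);
    if (!shm->result_ready)
        return 0;
    *out = shm->result;
    shm->result_ready = 0;
    return 1;
}
```

On real hardware the two sides would run on different processors, so such a scheme would additionally need cache maintenance, memory barriers and agreement on byte order (the ARM side is little-endian, while LEON3 is a big-endian SPARC V8 core); none of that is modelled here.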

SYSTEM DESCRIPTION

[Figure: THE CONCEPT. Labels: ISLE (Input management), ISLE (Custom FFT), ISLE (Localization algorithm), ISLE (Accelerator), Shared memory, To host, Monitoring operations, Dynamic adaption of the system.]

LINUX

Isle #1 runs a customized SMP Linux distribution provided by Xilinx. Isle #2 runs a customized SMP Linux distribution provided by Gaisler.

LEON3

The LEON3 processor is designed for embedded applications, combining high performance with low complexity and low power consumption. The LEON3 processor is highly configurable.

[Figure: LEON3 block diagram. Labels: 7-stage integer pipeline, 3-port register file, IEEE-754 FPU, co-processor, HW MUL/DIV, I-cache, D-cache, local IRAM, local DRAM, ITLB, DTLB, SRMMU, trace buffer, debug port, interrupt port, AHB I/F, AMBA AHB master (32-bit).]

HW PROFILING SYSTEM

A distributed hardware profiling system has been developed for runtime analysis. It is composed of distributed monitoring elements (sniffers) that observe the AHB bus and are initialized by means of an AXI bus. A global monitor unit, represented by Isle #1, initializes the sniffers and collects their results.

[Figure: SNIFFER BLOCK DIAGRAM. Labels: AHB adapter, decode section, event monitor, time monitor, counter, APB interface, APB bus.]

PROPOSED ARCHITECTURE

[Figure: PROPOSED ARCHITECTURE. Labels: ISLE #1 (ARM), ISLE #2 (LEON3 cores), S1, S2, S3, AMBA AHB, AHB bus, AHB/APB bridge, memory controller, AXI controller, SRAM, Ethernet MAC, PHY, UART, UART - USB.]

OPENMP

Libraries required to execute shared-memory parallel applications, developed with OpenMP C/C++, have been cross-compiled and added to the adopted Leon3 Linux distribution.

[Figure: OpenMP fork-join execution model, thread IDs 0 to 3.]

-> Non-parallel region: master thread only (ID:0).

-> Parallel region starts (fork): #pragma omp parallel.

-> Parallel region: several threads execute simultaneously (ID:0, ID:1, ID:2, ID:3).

-> Parallel region ends (join): program waits for all threads to terminate.

-> Program reverts to single-threaded execution (ID:0).
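The fork-join sequence described above can be reproduced with a few lines of OpenMP C. This is a minimal sketch, not code from the A-LOOP demo: the function name `fork_join_demo` and the constant `MAX_THREADS` (set to 4 on the assumption of one thread per Leon3 core) are illustrative. The `_OPENMP` guards let the same file build without OpenMP support, in which case only the master thread runs.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 4  /* assumption: one thread per Leon3 core of Isle #2 */

/* Runs one fork-join cycle. ids[i] is set to 1 by the thread whose ID is i.
 * Returns the number of threads that formed the team. */
int fork_join_demo(int ids[MAX_THREADS])
{
    assert(ids != NULL);
    int team_size = 1;

    /* Non-parallel region: master thread (ID 0) only. */
    for (int i = 0; i < MAX_THREADS; i++)
        ids[i] = 0;

    /* Fork: the parallel region starts here. */
    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int id = 0;  /* private to each thread */
#ifdef _OPENMP
        id = omp_get_thread_num();
        #pragma omp single
        team_size = omp_get_num_threads();
#endif
        ids[id] = 1;  /* each thread writes only its own slot */
    }
    /* Join: implicit barrier, all threads have terminated. */

    /* Execution is single-threaded again (ID 0). */
    return team_size;
}
```

When built with an OpenMP-enabled compiler (for example with GCC's `-fopenmp` option), the runtime may still grant fewer than `MAX_THREADS` threads, which is why the function reports the actual team size instead of assuming four.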

SYSTEM BEHAVIOUR

The proposed profiling technique, used to monitor the computational behaviour of the A-LOOP platform, follows the approach of runtime bus sampling.

-> Event monitor: generates a strobe (ld_ac_event) on each access within a specified address range (delimited by sig_out_inf and sig_out_sup).

-> Time monitor: a counter started by a read operation (during_read) and stopped by a write operation (during_write), both on a specified address (0x808).
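The behaviour of the two monitors can be made concrete with a small software model. This is only a behavioural sketch in C, not the hardware description used on the platform. The signal names `ld_ac_event`, `sig_out_inf` and `sig_out_sup` and the address 0x808 come from the text; the `sniffer_model` structure, the `sniffer_step` function, the inclusive range bounds and the choice to reset the counter on each new read are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TIME_MONITOR_ADDR 0x808u  /* address quoted in the text */

/* Hypothetical software model of one sniffer. */
typedef struct {
    uint32_t sig_out_inf;  /* lower bound of the watched address range */
    uint32_t sig_out_sup;  /* upper bound (assumed inclusive) */
    uint32_t event_count;  /* number of ld_ac_event strobes generated */
    bool     timing;       /* true while the time counter is running */
    uint32_t elapsed;      /* bus cycles counted between read and write */
} sniffer_model;

/* Advances the model by one bus clock cycle.
 * access:   true if a bus transfer takes place in this cycle
 * is_write: direction of that transfer
 * addr:     address of that transfer
 * Returns the value of the ld_ac_event strobe for this cycle. */
bool sniffer_step(sniffer_model *s, bool access, bool is_write, uint32_t addr)
{
    assert(s != NULL);

    if (s->timing)
        s->elapsed++;  /* time monitor counts every cycle while running */

    if (!access)
        return false;

    /* Event monitor: strobe on any access inside the watched range. */
    bool ld_ac_event = (addr >= s->sig_out_inf && addr <= s->sig_out_sup);
    if (ld_ac_event)
        s->event_count++;

    /* Time monitor: a read starts the counter, a write stops it. */
    if (addr == TIME_MONITOR_ADDR) {
        if (!is_write) {
            s->timing = true;
            s->elapsed = 0;  /* assumption: each read restarts the count */
        } else {
            s->timing = false;
        }
    }
    return ld_ac_event;
}
```

Under this model, profiled software would bracket a region of interest with a read and a write at 0x808, and the monitoring side would read `elapsed` and `event_count` afterwards, so no instrumentation code beyond those two accesses runs on the profiled cores.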

Main Contacts: giacomo.valente@graduate.univaq.it, vittoriano.muttillo@graduate.univaq.it, andrea.bufalino@student.univaq.it, marco.santic@univaq.it, luigi.pomante@univaq.it, marco.faccio@univaq.it, fabio.federici@univaq.it

UNIVERSITA’ degli STUDI dell’AQUILA - CENTER of EXCELLENCE DEWS (ITALY)

Performance evaluation of the platform by means of a Pi calculation algorithm, proposed in four different versions: serial computation, single program multiple data (SPMD) technique with false sharing, SPMD technique without false sharing, and OpenMP reduction.

[Figure: performance results for the four versions. Legend: 1 Thread, 2 Threads, 3 Threads, 4 Threads.]
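The versions of the Pi benchmark can be sketched as follows. The poster does not state which Pi algorithm was used, so this sketch assumes the common numerical integration of 4/(1+x^2) over [0, 1] with the midpoint rule. The function names, `NUM_THREADS` and the padding constant `PAD` are illustrative choices, not values taken from the platform.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define NUM_THREADS 4  /* assumption: one thread per Leon3 core */
#define PAD 8          /* assumption: 8 doubles cover at least one cache line */

/* Serial version: midpoint-rule integration of 4/(1+x^2) on [0, 1]. */
double pi_serial(long steps)
{
    assert(steps > 0);
    double step = 1.0 / (double)steps, sum = 0.0;
    for (long i = 0; i < steps; i++) {
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}

/* SPMD version without false sharing: each thread accumulates into its own
 * padded slot, so the per-thread sums do not share a cache line. */
double pi_spmd_padded(long steps)
{
    assert(steps > 0);
    double sum[NUM_THREADS][PAD];
    double step = 1.0 / (double)steps;
    int nthreads = 1;

    for (int t = 0; t < NUM_THREADS; t++)
        sum[t][0] = 0.0;

    #pragma omp parallel num_threads(NUM_THREADS)
    {
        int id = 0, n = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        n = omp_get_num_threads();
#endif
        if (id == 0)
            nthreads = n;  /* record the team size actually granted */
        for (long i = id; i < steps; i += n) {  /* cyclic distribution */
            double x = ((double)i + 0.5) * step;
            sum[id][0] += 4.0 / (1.0 + x * x);
        }
    }

    double pi = 0.0;
    for (int t = 0; t < nthreads; t++)
        pi += sum[t][0] * step;
    return pi;
}

/* Reduction version: OpenMP combines the per-thread partial sums. */
double pi_reduction(long steps)
{
    assert(steps > 0);
    double step = 1.0 / (double)steps, sum = 0.0;
    #pragma omp parallel for reduction(+:sum) num_threads(NUM_THREADS)
    for (long i = 0; i < steps; i++) {
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}
```

Setting `PAD` to 1 packs the per-thread sums next to each other and gives the SPMD variant with false sharing: the result is the same, but threads then contend for the same cache line on every update.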

http://dews.univaq.it

