
Dr.M.Sangeetha et al., International Journal of Advanced Research in Innovative Discoveries in Engineering and Applications[IJARIDEA] Vol.1, Issue 1,27 October 2016, pg. 22-27

Control Construct Estimation For Partitioned Binaries In Codesign System
Dr.M.Sangeetha¹
Department of Electronics and Communication Engineering, Bharath University, Chennai, India¹
sang_gok@yahoo.com¹

Abstract— A high-level description is transformed into software binaries, which are partitioned into software and hardware. The software runs on a microprocessor and the hardware on custom hardware. An estimator is designed to estimate the partitioned hardware. Partitioning of this kind improves performance, energy and power for many benchmarks. Single-chip microprocessor/FPGA platforms are attractive for their performance. Much work has been reported on source-level partitioning, but little on binary-level partitioning.

Keywords— Assembly language, Co design system, Decompilation, Estimation, Software binaries.

I. INTRODUCTION

The behavioral description is transformed into software binaries, which are partitioned into hardware and software. The partitioned software module runs on a microprocessor and the hardware module runs on a custom processor. The partitioned hardware is implemented as an Application Specific Integrated Circuit (ASIC) or on an FPGA. The interface between hardware and software is generated, and RAM is created for the partitioned hardware. Previous work on hardware/software partitioning shows clear advantages in embedded system design. Tool flow problems arise when partitioning in commercial environments, so binary partitioning is proposed as a solution. Drawing on previous work in decompilation, partitioning of software binaries, as in Fig. 1, gives results competitive with source code partitioning.

Fig.1 Co-design System

II. PREVIOUS WORK

Hardware/software partitioning has been studied for several decades, and many automated commercial products are available. An estimator for the control construct was designed and verified for different control constructs [1], [3], [4]. Estimation of the control construct using the BNG, together with a representation of the data construct, is shown in [5]. The intermediate representation is shown in [2]. Scheduling based on control flow and data flow is detailed in [6].

III. ASSEMBLY LEVEL PARTITIONING

The source code is read by the compiler front end and converted into an intermediate format. Profile data is obtained from the intermediate format. A partitioner is designed to separate the critical kernel from the non-critical blocks. The critical regions are identified from the profile data and partitioned for implementation in hardware. The equivalent hardware is fed to the synthesis tool, which generates the hardware implementation. The partitioned software module is compiled to assembly code by a compiler back end for the processor. Hardware estimation is normally carried out after the hardware module is synthesized. Because the generated hardware is complex, synthesis takes a long time. To simplify this step, an estimator is designed for the partitioned hardware.

IV. INTERMEDIATE REPRESENTATION

The behavioral description is compiled and equivalent software binaries are generated. The binary is partitioned into hardware and software modules. An intermediate representation is obtained for the binaries partitioned for hardware. The intermediate format used in this paper is the Control Data Flow Graph (CDFG). A few optimization techniques, such as dead code elimination and copy propagation, are applied to the intermediate representation. The control construct of the partitioned hardware is estimated from the CDFG. The proposed work does not support estimation of the data construct.

A. Behavioral Network Graph (BNG)

In a hardware-oriented approach, the simulated design undergoes High Level Synthesis (HLS) and Logic Synthesis. The design methodologies for HLS and Logic Synthesis differ, so the resulting netlist is unoptimized and an optimized estimation is not possible. If HLS and Logic Synthesis used the same intermediate format, the resulting netlist would be an optimized one. The behavioral description is transformed into a CDFG, and an equivalent BNG is generated from the CDFG. The BNG is used in the design methodology, and the equivalent hardware is generated directly from the BNG. Optimization is therefore done only in the BNG, so the actual hardware can be estimated.

Unlike the behavioral description, in the BNG RTL networks the states are the registers, and their number and transition relations are known a priori. All possible schedules can be represented with the BNG, and different scheduling algorithms can be used to interpret the BNG representation. Logic network behavior can also be represented using the BNG. Various behaviors are derived for the logic network, and Boolean algebra is used for synthesis. The CDFG is scheduled using different scheduling algorithms, resource sharing and allocation are performed in high level synthesis, and logical transformations are applied, which effectively unifies the behavioral and logical domains. The BNG equivalent to the CDFG is then used for RTL generation.

The main limitation of the BNG is that it works only with a fixed CDFG as input; it cannot adapt dynamically to the hardware specification. To generate a fixed CDFG, optimization techniques such as loop unfolding and parallelism extraction are required. The equivalent BNG is generated for the unfolded CDFG, and a few steps are carried out to generate the hardware. The estimation is performed only for the control construct. Each node in the CDFG is represented with a State Value Node.

B. State Value Node

Each node in the CDFG is replaced with a state value node (STN). Each state value node is associated with a state cut variable. The state cut is placed according to the hardware resources available. The node itself represents a register and a multiplexer. When the state cut is ‘1’, the node is replaced with a register. If the state cut is ‘0’, the node is replaced with a wire. Fig. 2 shows the state value node (STN) with its state cut variable (SCi).

Fig. 2 State Value Node
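As a minimal sketch of this rule, assuming a state cut is stored as an integer 0 or 1, the resolution of a state value node could be written as follows. The function name `resolve_stn` and the node labels are hypothetical and are not taken from the paper.

```python
def resolve_stn(state_cut: int) -> str:
    """Resolve one state value node from its state cut variable SCi."""
    if state_cut not in (0, 1):
        raise ValueError("state cut must be 0 or 1")
    # SCi = 1 keeps a register at the node; SCi = 0 reduces it to a wire.
    return "register" if state_cut == 1 else "wire"


# Hypothetical state cut assignment for three nodes.
state_cuts = {"STN1": 1, "STN2": 0, "STN3": 1}
resolved = {name: resolve_stn(sc) for name, sc in state_cuts.items()}
print(resolved)
```

Only the nodes whose state cut is 1 contribute a register to the hardware estimate.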

C. Matrix Representation

The specification, once converted into a CDFG, is transformed into a matrix. Only nodes that have a successor are entered in the matrix. Each row represents a current node and each column represents a successor node, so an entry marks a current node together with its successor. If a row contains more than one entry, the entries are replaced with the ‘&’ symbol. If a column contains more than one entry, the matrix is extended according to the number of entries: for n entries in a column, the successor is extended by n-1 columns and the entries are represented with the ‘+’ symbol. The limitation of the matrix format is that the matrix grows as the number of CDFG nodes increases.

D. Generating BNG from CDFG

Each node in the CDFG is replaced with a State Value Node (STN). The state value node connects its predecessor and successor. The state value node itself represents a register and a multiplexer. It is resolved into either a register or a wire depending on the state cut. The hardware estimation is performed based on the following rules:

1. Each node is associated with an SCi variable.
2. Traverse the CFG and, for each control flow node i with a single predecessor and a single successor, create an STNi net that represents the control signal activating the operation in control flow node i.
3. For join nodes, create a state value node STNij for each predecessor edge and connect all STNij nets to a single OR gate. The output of the OR gate is called net STNi.
4. For nodes with multiple successor edges (fork nodes), create a value node STNi and connect its output net to as many AND gates as there are successor edges. Each AND gate has two inputs: the first input is net STNi (for the fork node) and the other input is a net representing the condition of the corresponding successor edge. This condition net may be a primary input or a net from the data path. The output of each AND gate is called net STNij.
5. Connect the multiple STNi boxes in the same topology as the CFG. Extra STNi boxes are created for each predecessor edge in a join node. These are needed so that a state cut can be placed on each edge independently of the others.

V. EXPERIMENTAL RESULTS

A simple CFG is drawn for an IF construct, as shown in Fig. 3. Each node in the graph represents one line of assembly language. The graph contains a sample of 10 nodes. The hardware resources taken from the assembly language are shown in the graph, which contains four adders. Each edge of a node carries a state cut variable (SCi). The CFG of Fig. 3 is given as input, and the equivalent BNG representation is obtained. Each node is represented by a state value node with its state cut variable.


Fig.3. Control Flow Graph for IF Construct
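The matrix representation of Section IV-C can be illustrated on a small IF-shaped CFG. The sketch below is only an illustration under stated assumptions: the six-node edge list is hypothetical (it is not the 10-node graph of Fig. 3), and the helper names `successor_matrix` and `forks_and_joins` are invented here. Rows are current nodes and columns are successor nodes; a row with more than one entry is a fork (the ‘&’ case) and a column with more than one entry is a join (the ‘+’ case).

```python
from typing import List, Tuple


def successor_matrix(nodes: List[int], edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Build a 0/1 matrix: row = current node, column = successor node."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return matrix


def forks_and_joins(nodes: List[int], matrix: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Forks have several entries in their row, joins several in their column."""
    forks = [nodes[i] for i, row in enumerate(matrix) if sum(row) > 1]
    joins = [nodes[j] for j in range(len(nodes))
             if sum(row[j] for row in matrix) > 1]
    return forks, joins


# Hypothetical IF construct: node 2 branches to 3 or 4, which rejoin at 5.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6)]
matrix = successor_matrix(nodes, edges)
forks, joins = forks_and_joins(nodes, matrix)
print(forks, joins)
```

For this edge list, node 2 is the only fork and node 5 is the only join. The matrix is square in the number of nodes, which reflects the growth limitation noted for the matrix format.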

The complete generated BNG represents the hardware resources and the state cut (SCi) in each state value node (STN). For a single resource, state cuts need to be placed at SC1, SC2, SC4 and SC5. If a state cut variable is assigned ‘1’, the state value node is reduced to a register. Otherwise the state value node is reduced to a wire. The equivalent FSM representation with one-hot encoding is shown in Fig. 4.
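One way to picture how a resource limit leads to state cuts is a simple greedy pass over a straight-line sequence of nodes. This is a sketch under assumptions, not the estimator used in the paper: the function `place_state_cuts`, the operation labels, and the greedy policy (start a new state whenever the next operation would exceed the available units) are all hypothetical.

```python
from typing import Dict, List


def place_state_cuts(ops: List[str], available: Dict[str, int]) -> List[int]:
    """Return one state cut bit per node of a straight-line node sequence.

    ops[i] names the resource node i needs ("add", ...) or "" for none.
    A 1 means a state cut is placed at node i, so node i starts a new state.
    """
    cuts: List[int] = []
    used: Dict[str, int] = {}  # units of each resource used in the current state
    for op in ops:
        if not op:
            cuts.append(0)  # no resource needed, stay in the current state
            continue
        need = used.get(op, 0) + 1
        if need > available.get(op, 0):
            cuts.append(1)  # resource exhausted: cut here, start a new state
            used = {op: 1}
        else:
            cuts.append(0)
            used[op] = need
    return cuts


# Hypothetical sequence with four additions.
ops = ["add", "add", "", "add", "add"]
one_adder = place_state_cuts(ops, {"add": 1})
two_adders = place_state_cuts(ops, {"add": 2})
print(one_adder, two_adders)
```

With one adder, every addition after the first forces a cut; with two adders, fewer cuts and therefore fewer registers are needed.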

Fig.4. FSM for IF Construct
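In a one-hot encoded FSM, each state has its own flip-flop and exactly one flip-flop is set at a time, so the register count equals the number of states. The short sketch below compares this with a plain binary encoding. The helper names are hypothetical and the four-state example is an assumption chosen for illustration.

```python
from typing import List


def one_hot_codes(n_states: int) -> List[str]:
    """Return the one-hot code word of each state as a bit string."""
    return [format(1 << i, f"0{n_states}b") for i in range(n_states)]


def flip_flops(n_states: int, encoding: str) -> int:
    """Number of state flip-flops needed for the given encoding."""
    if encoding == "one_hot":
        return n_states
    if encoding == "binary":
        # Smallest number of bits that can distinguish n_states states.
        return max(1, (n_states - 1).bit_length())
    raise ValueError("unknown encoding: " + encoding)


codes = one_hot_codes(4)
print(codes)
print(flip_flops(4, "one_hot"), flip_flops(4, "binary"))
```

One-hot encoding uses more flip-flops than binary encoding, but each state register maps directly onto one state value node whose state cut is 1.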

The estimated hardware for the control construct is shown in Fig. 5; it is obtained from the simulation as an expression. The equivalent registers, AND gates and OR gates can therefore be estimated for the control construct.


Fig.5. Generated Hardware for Simple IF Construct
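The rules of Section IV-D can be turned into a rough counting procedure. The following is a minimal sketch under assumptions, not the paper's estimator: the edge list is a hypothetical six-node IF construct, the state cut assignment is invented, entry and exit nodes are treated like single-predecessor, single-successor nodes, and the names `estimate_control` and `STN5_3` are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def estimate_control(edges: List[Tuple[int, int]], cuts: Dict[str, int]) -> Dict[str, int]:
    """Count STN boxes, registers, AND gates and OR gates for a CFG."""
    preds, succs = defaultdict(list), defaultdict(list)
    nodes: List[int] = []
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
        for n in (src, dst):
            if n not in nodes:
                nodes.append(n)

    stn_boxes: List[str] = []
    and_gates = or_gates = 0
    for n in nodes:
        if len(preds[n]) > 1:
            # Rule 3: one STNij per predecessor edge, merged by a single OR gate.
            stn_boxes += [f"STN{n}_{p}" for p in preds[n]]
            or_gates += 1
        else:
            # Rule 2 (also applied to entry and exit nodes as an assumption).
            stn_boxes.append(f"STN{n}")
        if len(succs[n]) > 1:
            # Rule 4: one two-input AND gate per successor edge of a fork.
            and_gates += len(succs[n])

    # An STN box becomes a register only where its state cut is 1.
    registers = sum(cuts.get(name, 0) for name in stn_boxes)
    return {"stn_boxes": len(stn_boxes), "registers": registers,
            "and_gates": and_gates, "or_gates": or_gates}


# Hypothetical IF construct and state cut assignment.
if_edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6)]
estimate = estimate_control(if_edges, {"STN2": 1, "STN3": 1, "STN4": 1})
print(estimate)
```

The counts depend entirely on the assumed graph and state cuts, so they are not meant to reproduce the figures reported for Fig. 5.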

The data construct is not estimated. The reduced control construct has four registers, two-input AND gates and an OR gate. The estimation can also be carried out for more than one hardware resource with one-hot encoding. The nodes in the CFG are represented with their successors, as shown in Fig. 6, and the nodes and their successors are placed in matrix format. The rows show the current nodes and the columns show the successor nodes. A fork node is replaced with an AND function, and when a join node is found the row and column are extended. The successors for the extended row are replaced with an OR function. The expanded matrix gives the complete BNG representation, which helps in reducing the hardware. A set of benchmarks written in a high-level language is compiled, and the equivalent assembly language is generated using the SDCC compiler. Estimation is carried out for the software binaries of the benchmarks.

VI. CONCLUSION

The high-level language is transformed to the binary level using the SDCC compiler and profiled using the GPSIM simulator. From the profile data, the critical kernel is identified and decompiled into a hardware description language. The control construct estimation for this decompiled binary is carried out for the benchmarks. Figure 6 shows the control data flow graph for the partitioned kernel of DCT.

Fig.6. Control flow graph for the critical kernel for DCT


Table I shows the estimates of registers, AND gates and OR gates for the partitioned benchmarks. The number of registers and gates could be reduced with resource-constrained scheduling. Since the availability of a single adder/subtractor is given as a constraint to the estimator, state cuts are placed at 1a, 3, 4, 5, 6, 8, 9, 13, 16 and 19 in the partitioned hardware module for one adder/subtractor. Estimation is carried out for different benchmarks.

TABLE II ESTIMATION FOR BENCHMARKS

VII. FUTURE WORK

The estimation may be extended to the data BNG and applied to a simple application.

REFERENCES

[1] M. Sangeetha and J. Raja Paul Perinbam, ‘Estimation of control logic for binary synthesis’, International Conference on Signal Processing, Communications and Networking, 2008.
[2] M. Sangeetha, C. Tharini and J. Rajapaul Perinbam, ‘Equivalent Design Representation and Transformation for High Level Synthesis’, online proceedings of the 2nd IEEE RTAS Workshop on Model-Driven Embedded Systems (MoDES '04), Toronto, Ontario, Canada, May 2004.
[3] M. Sangeetha, S. Revathy and J. Rajapaul Perinbam, ‘Hardware Estimation and Synthesis for a Codesign system’, Proceedings of the IEEE co-sponsored 1st International Conference on Signal Processing, Communications and Networking (ICSCN 2007), Chennai, February 22-24, 2007, pp. 414-419.
[4] M. Sangeetha, J. Rajapaul Perinbam and M. Kumaran, ‘Estimation of control logic for binary synthesis’, Proceedings of the IEEE co-sponsored 2nd International Conference on Signal Processing, Communications and Networking (ICSCN 2008), February 22-24, 2008, pp. 454-457.
[5] R. A. Bergamaschi, ‘Bridging the domains of high level and logic synthesis’, IEEE Trans. Computer-Aided Design, vol. 21, pp. 582–596, 2002.
[6] R. A. Bergamaschi and S. Raje, ‘Control flow versus data-flow-based scheduling: Combining both approaches in an adaptive scheduling system’, IEEE Trans. Very Large Scale Integration Syst., vol. 5, 1997.

© 2016, IJARIDEA All Rights Reserved

