EXPRESSION User Manual version 1.0

Contents

1. Figure 6: ALU Execute Unit

   Instruction In: the number of instructions coming into the unit per cycle.
   Instruction Out: the number of instructions dispersed out of the unit.
   Custom Properties: other miscellaneous properties. For instance, ARGUMENT _UNIT_ needs to be specified for the execute units and is used in pipeline trailblazing.

   4.1.1.2 Latch Properties

   [Figure 7: Setting Latch Properties. Dialog fields: Name, Class Name, Port Type (Input, Output, Other), Custom Properties.]

   A pipeline latch is characterized by the properties shown in the Properties window, which appears when any latch is clicked on the screen. The fields are described below.

   Name: name of the latch.
   Class Name: class that the latch belongs to. It can be one of InstStrLatch (carries instructions from instruction memory), InstructionLatch (carries instructions prior to decode), and OperationLatch (carries decoded instructions).
   Port Type: whether the direction of transfer is into or out of the latch.

   Each unit has latches associated with it. These are specified either by creating a latch within the unit, which specifies an output or other-type latch, or by a connection from a latch in another element to the unit, implicitly specifying an input from the latch of that
2. [Figure 1: Setting Output File Name under the Link tab. Project Settings dialog, Win32 Debug configuration, Link tab, General category.]

   4. Set pcProGUI as the active project. This project contains the GUI front end.
   5. Compile the pcProGUI project and run pcProGUI.exe by pressing F5.
   6. Click File > New followed by Architecture > New. The project is now ready to load an existing architecture description or create a new architecture from scratch.
   7. The schematic description of acesMIPS is stored in acesMIPS.gmd and the instruction set description in acesMIPS.isd. Load <run>/acesMIPS.gmd by choosing Load grap
3. Machine Description and Load Instruction Set Description, respectively.
   2. Click on the storage component whose size is to be changed.
   3. Change the value in the Size, Line Size, or Word Size field.

   [Figure 49: Changing size. Storage properties dialog with fields Type (Register File, ICache, DCache, SRAM, DRAM), Name, Class Name, Width, Size, Associativity, Cache Lines, Custom Properties, Access Time, Address Range.]

   4. Save the EXPRESSION machine description into acesMIPS.xmd.
   5. Repeat the steps described in Section 2.3 to evaluate the modified architecture.

   Expected Result: Increasing the size of a storage component may not improve performance if most of the data used by the application program already fits in the smaller size. Power does vary with size, but it also depends on the number of reads and writes, so increasing the size may not increase power drastically if the number of reads and writes stays the same. The performance figures obtained by changing storage sizes therefore need careful tradeoff analysis.

   5.3.4 Adding/Deleting Memory Modules

   It is possible to add new memory modules, for example an L3 cache between the L2 cache and the main memory module. It is also possible to delete any of the existing modules, for example the L2 cache connecting the L1 caches d
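The expected result for size changes in Section 5.3.3 can be illustrated with a minimal sketch. It assumes a fully associative LRU cache and a made-up address trace; neither the model nor the function name `lru_misses` comes from the EXPRESSION toolkit or SIMPRESS. Once the working set fits, growing the cache further leaves the miss count unchanged.

```python
from collections import OrderedDict

def lru_misses(trace, capacity_lines):
    """Count misses of a fully associative LRU cache (illustrative model only)."""
    cache = OrderedDict()              # line addresses, ordered oldest -> newest
    misses = 0
    for line in trace:
        if line in cache:
            cache.move_to_end(line)    # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > capacity_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses

# Hypothetical trace: a loop touching 8 distinct cache lines, repeated 10 times.
trace = list(range(8)) * 10

for capacity in (4, 8, 16, 32):
    print(capacity, lru_misses(trace, capacity))
```

In this trace the miss count drops sharply between 4 and 8 lines and then stays flat, which is why a larger size alone may not show up in the simulator's performance figures.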
4. <sun_work>. Run the following commands in sequence to perform the conversions:

   1. In csh: set path = ( <path to scripts dir> $path )
   2. mips fe all <filename>.c
   3. mips2expr all <filename>

   The generated files <filename>.procs and <filename>.defs will be referred to as benchmarks in Section 2.3. Note that if you do not have a Sun SPARC machine and want to compile your own C applications for use with the EXPRESSION framework, you can do this compilation online at http://www.cecs.uci.edu/cgi-bin/cgiwrap/sudeep/file_upload.cgi by uploading the C application to the server, which generates the required files <filename>.procs and <filename>.defs for the EXPRESSION toolkit.

   2.3 EXPRESSION Flow

   To begin with, we take a MIPS R4000 based architecture developed in our ACES laboratory. We call this architecture acesMIPS. All the subsequent sections refer to this architecture frequently for illustration. The EXPRESSION ADL description of acesMIPS is available in <work>/acesMIPSDll/bin/Example_acesMIPS.xmd. The complete flow, from setting up the framework and loading the acesMIPS architecture in the graphical user interface (GUI) to evaluating the architecture, consists of the following steps in sequence:

   1. The run directory is <work>/acesMIPSDll/bin. Copy all the benchmarks (<filename>.procs and <filename>.defs) to be run to this directory.
5. REG(3) SRC[3]=REG(6) should be specified before the rules for mult and addu, which are as follows:

   GENERIC  IMUL DST[1]=REG(1) SRC[1]=REG(2) SRC[2]=REG(3)
   TARGET   mult DST[1]=REG(1) SRC[1]=REG(2) SRC[2]=REG(3)

   GENERIC  IADD DST[1]=REG(1) SRC[1]=REG(2) SRC[2]=REG(3)
   TARGET   addu DST[1]=REG(1) SRC[1]=REG(2) SRC[2]=REG(3)

   This will ensure that whenever there is an opportunity to generate a mac operation, the compiler will generate it.

   Unlike the operations already present in acesMIPS, the mac operation has three sources. However, there are only two input ports in the ALU units. We need to add another read port to each of the ALU units and then bind the operation group containing the mac operation to the units. You must perform the following steps to add the mac operation using the GUI:

   1. Load acesMIPS.gmd and acesMIPS.isd using Load Graphical Machine Description and Load Instruction Set Description, respectively.
   2. Invoke Set OP_GROUPS from the Instruction Set menu.
   3. Add the mac operation to ALU_Unit_ops and set its attributes as shown in Fig. 30. You need to perform the following steps in sequence to accomplish that:
      - Click ALU_Unit_ops.
      - Select o to create a NewOp and add to the list of ALU_Unit_ops operat
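Why the order of tree-mapping rules matters can be sketched as follows. This is a simplified illustration, not the EXPRESS instruction selector: it assumes rules are tried in the order they are listed, it ignores operand bindings, and the `mflo` rule is a hypothetical addition so that the fallback path is complete. The three-operation generic pattern for mac (IMUL, MFLO, IADD) follows the mac rule of Fig. 31.

```python
def select(generic_ops, rules):
    """Greedy selection: at each position apply the first rule whose generic
    pattern matches. Operand bindings are ignored in this sketch."""
    target = []
    i = 0
    while i < len(generic_ops):
        for pattern, target_op in rules:
            if tuple(generic_ops[i:i + len(pattern)]) == pattern:
                target.append(target_op)
                i += len(pattern)
                break
        else:
            raise ValueError(f"no rule matches {generic_ops[i]}")
    return target

MAC = (("IMUL", "MFLO", "IADD"), "mac")
MULT = (("IMUL",), "mult")
ADDU = (("IADD",), "addu")
MFLO = (("MFLO",), "mflo")   # hypothetical rule, not taken from the manual

ops = ["IMUL", "MFLO", "IADD"]
mac_first = select(ops, [MAC, MULT, ADDU, MFLO])
mac_last = select(ops, [MULT, ADDU, MFLO, MAC])
print(mac_first)
print(mac_last)
```

With the mac rule listed first the whole sequence collapses into one mac operation; with it listed last, the single-operation rules match first and mac is never generated.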
6. ALU2 support the same set of single-cycle operations. As an example, let's modify the base architecture to allow a two-cycle multiply (mult) operation on ALU2 and the rest of the operations, which are all single-cycle operations, on ALU1. To accomplish this, execute the following steps in sequence:

   1. Load the acesMIPS architecture.
   2. Invoke Set OP_GROUPS again. Click g to add a new operation group. Click on the newly added NewGroup, change the name to MultGroup, and press the Apply button.

   [Figure 38: Adding MultGroup. Operation Groups dialog with fields Name (MultGroup), Op Type, Behavior, Operand 1 to 4 with their types, ASM FORMAT, IR DUMP FORMAT.]

   3. Click OK to save and close the current window.
   4. Select Set OP_GROUPS again. Select MultGroup from the list of op groups.
   5. Click o as shown in Fig. 38 and add the mult operation with the attributes shown in Fig. 39. Set the ASM FORMAT as COND dst1 reg src1 reg src2 reg PRINT \t <opcode> \t <dst1> <src1
7. [Generic machine opcode table, fragment. Columns: opcode; operands (DEST, SRC1, SRC2); operand types; behavior; comments. Only the legible entries are listed.]

   Opcode   DEST type   SRC1/SRC2 type   Comments
   IMUL     R           R
   IMULU    R           R                Unsigned multiply; however, we do not distinguish between signed and unsigned datatypes
   DDIV     F_EVEN      F_EVEN
   FDIV     F           F
   IEQU     CC
   DEQ      CC          F_EVEN
   FEQ      CC          F
   INE      CC          R
   INEU     CC          R
   DNE      CC          F_EVEN
   FNE      CC          F
   ILE      CC          R
   ILEU     CC          R
   DLE      CC

   The row preceding IEQU (DEST: CC, sources: R; opcode not legible) carries the comment: sets CC to 01 if src1 and src2 are equal.
8. associativity of caches (Section 5.3.2); changing sizes of caches/memories (Section 5.3.3); adding new memory components in the memory subsystem (Section 5.3.4).

   1.3 Recommended System Configuration

   The EXPRESSION toolkit has been tested on the following system:

   System OS Name: Microsoft Windows XP Professional
   Version: 5.1.2600 Build 2600
   System Type: X86-based PC
   Processor: x86 Family 15 Model 1 Stepping 2 GenuineIntel 1 GHz
   Total Physical Memory: 512.00 MB
   Total Virtual Memory: 1.72 GB
   Page File Space: 1.22 GB
   Development Platform: Visual C++ 6.0 Enterprise Edition

   1.4 Contact

   To give comments or feedback, or to report bugs, send email to express@cecs.uci.edu.

   2 EXPRESSION Toolkit Setup

   The current release of EXPRESSION can be downloaded from http://www.cecs.uci.edu/~express. There are two main components in EXPRESSION: the EXPRESS compiler and the SIMPRESS simulator. The toolkit is implemented with Microsoft Visual C++ 6.0 on an i686 machine running Microsoft Windows XP. It has also been tested on Microsoft Windows NT and Windows 2000. A SPARC Solaris 2.7 machine is also required for preprocessing an input application in C using a GCC-based front end. However, this latter step can also be performed from http://www.cecs.uci.edu/cgi-bin/cgiwrap/sudeep/file_upload.cgi by uploading the C application to the serv
9. [Figure 4: acesMIPS architecture on the GUI]

   The screen above shows the different architectural components that can be captured. They comprise the following:

   - Pipeline stages (called Units)
   - Latches between Units
   - Storage components
     o Register files
     o Memory modules (SRAM & DRAM)
     o Caches (ICache and DCache)
   - Ports (can be present in units as well as in memory and register files)
   - Connections between the ports, between storage elements, and between units

   Note: Although the compound unit and bus components are present in the GUI, they have not been used in the acesMIPS architecture framework and are not guaranteed to work. Note also that clicking on a component on the screen and then clicking anywhere else on the screen may sometimes cause a ghost (a residual image of that component) to appear on the screen. This is normal, and the ghost vanishes when you click another component on the screen.

   4.1.1 Architectural Components Specification

   In this section we briefly describe the aforementioned architectural components.

   4.1.1.1 Unit Properties

   Properties (Unit) dialog: Name: DECODE; Class Name: DecodeUnit; Supported OpCodes; Capacity; Timing: ALU_Unit_ops all 1, FALU_Unit_ops, BR_Unit_o
10. [Generic machine opcode table, fragment continued. Columns: opcode; operands (DEST, SRC1, SRC2); operand types; behavior; comments. Only the legible entries are listed.]

   Opcode   DEST type   SRC1/SRC2 type   Behavior / comments
   IGEU     CC          R
   DGE      CC          F_EVEN
   ILTU     CC          R
   ILSH     R           R                DEST = SRC1 <<L SRC2. Logical shift operation: if src2 is positive, shift left; else shift right.
   IASH     R           R                DEST = SRC1 <<A SRC2
   IRLSH    R           R                DEST = SRC1 >>L SRC2
   ILLSH    R           R                DEST = SRC1 <<L SRC2
   ILAND    R           R                DEST = SRC1 & SRC2
   ILOR     R           R

   Other comparison opcodes named in this fragment, all writing DEST of type CC: FLE, FGE, DLT, FLT, DGT, FGT.
11. <xmd filename>: Specify the input ADL file name for pipelined trailblazing.

   - PreSch: Prescheduling transformations. Perform different target-independent optimizations:
     o Dead code elimination
     o Copy propagation
   - pIList: Dump the instruction list on the console.
   - pHTG: Dump the Hierarchical Task Graph on the console.
   - pCFG: Dump the Control Flow Graph on the console.
   - pASM: Generate assembly code runnable on a native machine.
   - pDUMP: Generate special assembly output (IR dump) understood by the SIMPRESS simulator.
   - name <prefix>: Use <prefix> to prefix the generated assembly file name as well as the generated IR dump file name.

   SIMPRESS switches supported:

   - sRA: Run cycle-accurate simulation after register allocation.
   - fsRA: Run functional simulation after register allocation.
   - memCfg mem.config: Use the memory configuration specified in the mem.config file.

   To run EXPRESS without any optimization (which performs only instruction selection and register allocation) and then run SIMPRESS, use the following command-line options. Assume the input files are <filename>.procs and <filename>.defs.

   <filename>.procs <filename>.defs pDUMP name <filename> ISel RA memCfg mem.config sRA

   To run with different target-independent optimizations and Pipelined Trailblazing, type the following as command-line options: <filename>
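The unoptimized command line described above can be assembled with a small helper. This is only a sketch: the function name `express_options`, the benchmark name `fir`, and the `prefix` parameter are hypothetical. The switches are spelled exactly as they appear in this section; whether they need a leading character such as `-` is not shown here, so `prefix` defaults to an empty string.

```python
def express_options(filename, dumps=(), prefix=""):
    """Build the option list for instruction selection, register allocation,
    and cycle-accurate simulation, using the switches named in the text.
    `prefix` is a hypothetical knob for a leading switch character."""
    def sw(name):
        return prefix + name

    args = [f"{filename}.procs", f"{filename}.defs",
            sw("pDUMP"), sw("name"), filename,
            sw("ISel"), sw("RA"),
            sw("memCfg"), "mem.config"]
    args += [sw(d) for d in dumps]   # optional dumps: pIList, pCFG, pHTG
    args.append(sw("sRA"))
    return args

# "fir" is a made-up benchmark name used only for illustration.
print(" ".join(express_options("fir")))
print(" ".join(express_options("fir", dumps=("pIList", "pCFG"))))
```

Keeping the options in a list rather than a single string makes it easy to pass them to a process launcher without quoting problems.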
12. .procs <filename>.defs pDUMP name <filename> ISel RA EXPR ENAME acesMIPS.xmd pipeTbz Tbz PreSch memCfg mem.config sRA

   To run the above optimizations and dump the instruction list, the control flow graph, and the hierarchical task graph after register allocation, use the following command-line options:

   <filename>.procs <filename>.defs pDUMP name <filename> ISel RA EXPR ENAME acesMIPS.xmd pipeTbz Tbz PreSch memCfg mem.config pIList pCFG pHTG sRA

   Please also note that the command-line options can be specified in any order.

   The VSAT GUI [1] is the front end to the EXPRESSION framework for architectural design space exploration. This release focuses on the acesMIPS architecture for exploration. We will show in Section 5 how the framework can be used to perform architectural exploration. This section is broken up into two tutorials. The objective of the first tutorial (Section 4.1) is to familiarize the user with the representation of the different architectural components and the instruction set of the acesMIPS architecture. The second tutorial (Section 4.2) teaches how to add a new component in the GUI.

   The aim of this tutorial is to load the acesMIPS design in the GUI and to get familiar with the graphical environment.

   [GUI screenshot] Figure
13. system performance can be determined. You must perform the following steps to change associativity using the GUI:

   1. Load acesMIPS.gmd and acesMIPS.isd using Load Graphical Machine Description and Load Instruction Set Description, respectively.
   2. Click on the cache component whose associativity is to be changed.
   3. Change the value in the Associativity field.

   [Figure 48: Changing associativity. Storage properties dialog for the DCache named L2 (Class Name Storage), with fields Word Size, Line Size, Associativity, Cache Lines, Custom Properties, Access Time, Address Range.]

   4. Save the EXPRESSION machine description into acesMIPS.xmd.
   5. Repeat the steps described in Section 2.3 to evaluate the modified architecture.

   Expected Result: Changing the associativity can affect performance significantly. Higher associativity reduces the miss rate but also increases the hit time. Because of this tradeoff, results obtained on changing associativity require careful analysis of the memory subsystem and the application being executed.

   5.3.3 Changing Sizes

   Sizes of all the memory subsystem components can be varied by changing the SIZE attribute of the component. You must perform the following steps to change storage size using the GUI:

   1. Load acesMIPS.gmd and acesMIPS.isd using Load Graphical
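The miss-rate side of the associativity tradeoff in Section 5.3.2 can be illustrated with a minimal set-associative cache model. This sketch is not part of SIMPRESS: the function name `set_assoc_misses`, the LRU replacement policy, and the address trace are all assumptions, and the increase in hit time that comes with higher associativity is not modeled.

```python
def set_assoc_misses(trace, num_lines, ways):
    """Count misses of a set-associative LRU cache; addresses are line numbers."""
    num_sets = num_lines // ways
    sets = [[] for _ in range(num_sets)]   # each set is ordered oldest -> newest
    misses = 0
    for line in trace:
        lru = sets[line % num_sets]
        if line in lru:
            lru.remove(line)               # hit: re-appended below as most recent
        else:
            misses += 1
            if len(lru) == ways:
                lru.pop(0)                 # evict the least recently used line
        lru.append(line)
    return misses

# Hypothetical trace: two lines that map to the same set, accessed alternately.
trace = [0, 4] * 20

direct_mapped = set_assoc_misses(trace, num_lines=4, ways=1)
two_way = set_assoc_misses(trace, num_lines=4, ways=2)
print(direct_mapped, two_way)
```

With the same total capacity, the direct-mapped configuration misses on every access of this trace, while the two-way configuration misses only on the first touch of each line. Real applications sit somewhere in between, which is why the results need to be analyzed per application.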
14. the capture.

   8. Repeat the above procedure to add the datapath between L1 and MainMem. On selecting Edit datapaths from the Components menu, it should look like the figure below.

   [Data Paths dialog. The list starts with the storage path ||STORAGE_PATH|| L1 MainMem, followed by the register-file paths from FPRFile and GPRFile to the ALU1_READ, ALU2_READ, FALU_READ, BR_READ, and LDST_READ units. Each entry names the two components, their ports, and the connection between them, for example: GPRFile ALU2_READ GprReadPort3 GprReadPort3Alu2ReadPort1Cxn Alu2ReadPort1.]
15. unit. An OP_GROUP containing multi-cycle operations is linked with a multi-cycle functional unit. The general process to create a new multi-cycle unit as a parallel resource requires the following steps:

   i. Add new Read and Execute units by using Add Unit (Fig. 22). Add a latch (Fig. 24) to the Read unit and add a connection (Fig. 25) from the Read unit latch to the Execute unit.
   ii. Add a new port to the register file and also to the Read unit, and establish a connection between the ports. Make sure that you specify the proper class names for all of the components (ports, latches, units) that you add.
   iii. Create a new OP_GROUP (g) and add the multi-cycle operation (o) to the operation group. Refer to Fig. 15.
   iv. Link the newly created OP_GROUP with both the Read and Execute units. Supported opcodes in the unit properties must contain this OP_GROUP.
   v. In the timing section of the units, specify the appropriate number of cycles along with the opcode. For example, if a multiplier takes two cycles, the corresponding Read and Execute units will have mult 2 specified in the timing section.
   vi. Increase the capacity of the connection from the WriteBack unit to the RegisterFile by one.

   Adding a new functional unit as a parallel resource is potentially equivalent to adding a new pipeline path.

   In the base architecture, both ALU1 and
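Steps iv and v above are easy to get out of sync between the Read and Execute units. The following sketch checks them on a hand-written description of the units. The dictionary layout, the function name `check_multicycle_setup`, and the unit and group names are illustrative assumptions; this is not the format used by the GUI or by the EXPRESSION description.

```python
def check_multicycle_setup(units, group, opcode, cycles):
    """Return a list of problems with steps iv and v for the given units."""
    problems = []
    for name, props in units.items():
        # Step iv: the OP_GROUP must be among the unit's supported opcodes.
        if group not in props["supported_opcodes"]:
            problems.append(f"{name}: {group} missing from supported opcodes")
        # Step v: the timing section must give the cycle count for the opcode.
        if props["timing"].get(opcode) != cycles:
            problems.append(f"{name}: timing for {opcode} is not {cycles}")
    return problems

# Hypothetical Read/Execute pair; the Execute unit has the wrong timing.
units = {
    "ALU2_READ": {"supported_opcodes": ["MultGroup"], "timing": {"mult": 2}},
    "ALU2_EX":   {"supported_opcodes": ["MultGroup"], "timing": {"mult": 1}},
}

problems = check_multicycle_setup(units, "MultGroup", "mult", 2)
print(problems)
```

An empty list means both units list the operation group and agree on the cycle count.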
16. [Figure 17: Operand Types for an Operation. Operation dialog with an opcode list (including trunc_w_d, sgtu, sleu, sltu, li, div, mult, and) and the fields Operand 1 to 4 with their types, ASM FORMAT, and IR DUMP FORMAT.]

   The various fields are described below.

   Name: name of the opcode.
   Op Type: type of the opcode; can be data, control, or flow.
   Behavior: describes the behavior of the opcode.
   Operand X: specifies the operands in the opcode.
   Operand X Type: specifies the type of the operand. These types were defined in the VAR_GROUPs section.
   ASM Format: specifies the format for the standard assembly dump (enabled by option pASM).
   IR Dump Format: specifies the format for the intermediate representation dump, which acts as assembly for the simulator. The simulator expects the instruction to be in the following format: <opcode> dst1 dst2 src1 src2

   Clicking on o deletes the opcode.

   4.1.2.3 Set OPERAND_MAPPING

   [OPERAND MAPPING dialog]
17. 2. Invoke Microsoft Visual C++ of Microsoft Visual Studio 6.0 and open the workspace <work>/acesMIPS.dsw. In the FileView, the following projects should appear in this workspace:

      acesMIPS Base Class Lib
      acesMIPS Build System Lib
      acesMIPS Derived Class Lib
      acesMIPS Simulator Functions Lib
      acesMIPSConsole
      acesMIPSDll
      acesMIPSfuncSimulator
      expression console
      expression dll
      graphViz
      pcProGUI

   3. Select acesMIPSConsole from the Workspace window and press ALT+F7. This invokes the Settings window for the acesMIPSConsole project, shown in Fig. 1. Make sure you have the following settings for the projects:

      a. In the Link tab of the Settings window, for the General category, the output file name of the projects should be set as follows:
         i. acesMIPS console: acesMIPSdll/bin/acesMIPSconsole.exe
         ii. acesMIPSdll: acesMIPSdll/bin/acesMIPSdll.dll
         iii. expression console: acesMIPSdll/bin/expression console.exe
         iv. expression dll: acesMIPSdll/bin/expression dll.dll
         v. graphViz: acesMIPSDll/bin/graphViz.dll
         vi. pcProGUI: acesMIPSdll/bin/pcProGUI.exe
      b. In the Debug tab, for the General category, the working directory for the projects acesMIPS console, expression console, and pcProGUI should all be set to the run directory <work>/acesMIPSDll/bin.
18. 2: EXPRESSION GUI

   This is the first screen you will see when you run the GUI (pcProGUI.exe). Go to the File menu and select New, or click on the New icon on the toolbar. Next, go to the Architecture menu option and select New. You should now see the following screen.

   [Figure 3: Architecture Entry View]

   Now, from the File menu, choose the option to Load a Graphical Machine Description. Select and open the graphical machine description file <run>/acesMIPS.gmd. This file contains a layout of all the components, including pipeline stages, architecture units, register files, and the memory subsystem of the acesMIPS architecture.

   Then, from the Instruction Set menu, select the option to Load an Instruction Set Description. Select and open the instruction set description file <run>/acesMIPS.isd. This file contains the description of the acesMIPS instruction set.

   Clicking on any entity on the screen will bring up its properties in the Properties window. The Properties window will be overlaid on the main window.

   [GUI screenshot]
19. Apply.

   [Figure 33: Add new port to ALU2_READ. Port properties: Name Alu2ReadPort3; Class Name UnitPort; Port Type Read; Custom Properties ARGUMENT SOURCE 3, CAPACITY 1.]

   7. Add a connection between the newly added port of ALU2_READ and that of GPRFile by following the steps below:
      - Select Components > Add Connection.
      - Click on the newly added ports one after another.
      - Set the attributes for the connection as shown in Fig. 34.
      - Click Apply.
   8. Add a datapath corresponding to the connection between ALU2_READ and GPRFile by following the steps explained in Section 4.2.7.

   [Figure 34: Add Connection between the new ports. Connection properties: Name; Class Name RegisterConnection; Custom Properties.]

   9. So far we have given the ALU2_READ unit the capability to accept ALU_Unit_ops operations (which include the mac operation) having three source operands. Now repeat steps 5 through 8 to add the same capability to the ALU1_READ unit. Remember to keep the names of the ports and the connection different from the names shown in Fig. 32, Fig. 33, and Fig. 34.
   10. Save the EXPRESSION machine description into acesMIPS.xmd.
   11. Repeat the steps described in Section 2.3 to evaluate the modified architecture.

   Expected result: You should be able to notice that the generated c
20. EXPRESSION User Manual, Version 1.0 (05/28/2003)

   Authors: Partha Biswas, Sudeep Pasricha, Prabhat Mishra, Aviral Shrivastava, Nikil Dutt, and Alex Nicolau
   {partha, sudeep, pmishra, aviral, dutt, nicolau}@cecs.uci.edu
   http://www.cecs.uci.edu/~aces
   ACES Laboratory, Center for Embedded Computer Systems, School of Information and Computer Science, University of California, Irvine

   TABLE OF CONTENTS

   1 INTRODUCTION
   2 EXPRESSION TOOLKIT SETUP
   3 COMMAND LINE OPTIONS
   4 ARCHITECTURE ENTRY
   5 DESIGN SPACE EXPLORATION
   6 BENCHMARKS
   7 OPEN ISSUES AND FUTURE DIRECTIONS
   REFERENCES
   APPENDIX A: EXPRESSION ADL
   APPENDIX B: GENERIC MACHINE MODEL

   Acknowledgements

   We would like to thank and acknowledge the contributions of several former and current members of the ACES Lab in CECS who helped make the EXPRESSION project a reality. EXPRESSION would not have been possible without the invaluable contributions of the following people: Ashok Halambi, Peter Grun, Asheesh Khare, Nick Savoiu, Radu Cornea, Srikanth Srinivasan, and Vijay Ganesh. We are also grateful to all the members of the ACES
21. [Figure 18: Register Class mappings for Operands. OPERAND MAPPING dialog; the visible entry maps GENERIC DATATYPE INT, CLASSTYPE MEM to a TARGET class.]

   The registers in the generic machine are classified into a set of register classes based on types (GPRFile registers, FPRFile registers, return address register, register hard-wired to zero, etc.) or var_groups, already discussed in Section 4.1.2.1. The mappings of these generic register classes to a new set of target register classes are specified in this section.

   4.1.2.4 Set TREE_MAPPING

   [Figure 19: Tree Mapping. TREE MAPPING dialog (Generic > Target Opcode Mapping; press CTRL+Enter for a newline after typing text). Example entry:

   GENERIC  IADD DST[1]=REG(1) SRC[1]=REG(2) SRC[2]=IMM(3)
   TARGET   addu DST[1]=REG(1) SRC[1]=REG(2) SRC[2]=IMM(3)]

   This section is used to specify the Tree Mapping, which is a mapping from generic to target opcodes. The Instruction Selection phase uses it to convert the generic operations into the target operations.

   4.1.2.5 Set Instruction Description

   This section is used to specify the operation slots in a VLIW instruction. In the acesMIPS example, we have 4 slots for data operations (2 ALU operations, 1 FALU operation, 1 LDST operation) and 1 slot for a control operation. A valid VLIW instruction of word length 32 comprises any four out of these slots.
22. [Figure 21: Adding datapath. Data Paths dialog, continued: GPRFile read paths to the ALU2_READ, BR_READ, and LDST_READ units, the LDST_EX to L1 path, and the WB write paths to FPRFile and GPRFile; buttons Remove All, Remove, Cancel.]

   4.2 Tutorial II

   The aim of this tutorial is to show you how to add various components to the architecture. To add a component, click the button on the appropriate toolbar to select the component to be added, then click on the screen to place it. You can drag the component to place it anywhere on the screen, or right-click on it to resize it. To delete a component, click on it on the screen and press the DELETE key on your keyboard. Once the component is placed on the screen, clicking on it displays its properties in the Properties window, where they can be updated.

   4.2.1 Adding Unit

   [Figure 22: Add Unit]

   To add a new unit, click on the Add Unit button o
23. [Figure 39: Add mult operation. Operation dialog showing the ASM FORMAT and IR DUMP FORMAT entries for mult.]

   9. Click the ALU2_READ box and set the attributes shown in Fig. 40. Then click the ALU2_EX box and set the parameters shown in Fig. 41. Select MultGroup to be the operation group supported by both the ALU2_READ and ALU2_EX units. Set the Timing to mult 2 to indicate that mult is a 2-cycle operation.

   [Figure 40: ALU2_READ parameters. Properties (Unit): Name ALU2_READ; Class Name OpReadUnit; Supported OpCodes; Capacity 1; Timing mult 2; Instruction In; Instruction Out; Custom Properties.]

   [Figure 41: ALU2_EX parameters. Properties (Unit): Class Name ExecuteUnit; Supported OpCodes; Capacity 1; Timing mult 2; Instruction In; Instruction Out; Custom Properties ARGUMENT UNIT.]

   10. Save the EXPRESSION description and evaluate the changes made to the architecture.

   Expected Result: Degraded performance, owing to the increase in mult latency and the decrease in the number of resources for all operations.

   5.2.2 Adding a New Pipelined Functional Unit

   Adding a new pipeline path helps incr
24. [Figure 20: VLIW Instruction Template. Instruction Description dialog: Word Length 32; Instruction Slots with Type, Bitwidth, and Unit (for example ALU1_EX, ALU2_EX).]

   4.1.2.6 Edit datapaths

   This section is used to specify the various data paths in the architecture between units and storage elements, as well as paths between storage elements (e.g. L1 and L2 caches). From the Components menu, select Edit datapaths. The Data Paths dialog shows the various datapaths between units and storage elements such as register files and memories. Data paths between storage elements are shown prefixed with a ||STORAGE_PATH|| specifier.

   [Data Paths dialog. The list starts with the storage paths L1 L2 and L2 MainMem, followed by the FPRFile and GPRFile read paths to the ALU1_READ, ALU2_READ, FALU_READ, and LDST_READ units.]
25. Lab who took time out of their busy schedules to test and give feedback on the release, which helped us immensely.

   1 Introduction

   EXPRESSION is an Architecture Description Language (ADL) as well as a retargetable compiler/simulator toolkit for architectural design space exploration (DSE). A processor architecture can be captured using the Graphical User Interface (GUI). The front end of the toolkit generates the EXPRESSION description for the processor, which in turn steers automatic generation of a retargetable compiler and simulator. The key features of our design methodology include:

   - Ease of specification and modification of the architecture from the GUI.
   - Mixed behavioral/structural representation supporting a natural, concise specification of the architecture.
   - Explicit specification of the memory subsystem, allowing novel memory organizations and hierarchies.
   - Efficient specification of architectural resource constraints, allowing extraction of detailed Reservation Tables (RTs) for compiler scheduling.

   This document will serve as a manual for users involved in rapid exploration of programmable embedded systems.

   1.1 Organization of User Manual

   This user manual is organized as follows. Section 2 explains how to set up the EXPRESSION framework. Section 3 describes different command-line options available for running different components of EXPRESSION. Section 4 discusses the who
26. May 2001. [7] P. Mishra, F. Rousseau, N. Dutt, and A. Nicolau. Architecture Description Language driven Design Space Exploration in the Presence of CoProcessors. SASIMI, October 2001. [8] P. Mishra, P. Grun, N. Dutt, and A. Nicolau. Processor-Memory Co-Exploration driven by a Memory-Aware Architecture Description Language. VLSI Design, January 2001. [9] P. Mishra, N. Dutt, and A. Nicolau. Functional Abstraction driven Design Space Exploration of Heterogeneous Programmable Architectures. ISSS, October 2001. [10] S. Pasricha, P. Biswas, P. Mishra, A. Shrivastava, A. Mandal, N. Dutt, A. Nicolau. A Framework for GUI-driven Design Space Exploration of a MIPS4K-like Processor. CECS Technical Report 03-17, April 2003. Appendix A: EXPRESSION ADL. EXPRESSION employs a simple LISP-like syntax to ease specification and enhance readability. An EXPRESSION description is composed of two main sections: Behavior (or IS) and Structure. The Behavior section is further subdivided into Operations, Instruction, and Operation Mappings sections. The Structure section is subdivided into Components, Pipeline/Data Transfer Paths, and Memory Subsystem sections. A.1 Operations. This subsection describes the IS of the processor. The IS is organized into operation groups, with each group containing a set of operations having some common characteristics. Each operation is then described in terms of its opcode, operands
27. [Figure 30: mac operation. The dialog shows the ASM format and IR dump format entries for the operation, each conditioned on dst1, src1, src2, and src3 being registers.] [Figure 31: Rule for mac operation. TREE MAPPING (Generic to Target Opcode Mapping) dialog; press Ctrl+Enter for a newline after typing text. GENERIC: IMUL (DST 1 = REG 1, SRC 1 = REG 2, SRC 2 = REG 3); MFLO (DST 1 = REG 4, SRC 1 = REG 1); IADD (DST 1 = REG 5, SRC 1 = REG 6, SRC 2 = REG 4). TARGET: mac (DST 1 = REG 5, SRC 1 = REG 2, SRC 2 = REG 3, SRC 3 = REG 6).] 5. Do the following to add a new port to the GPR register file: select Components > Add Port and click anywhere inside the GPRFile box; click on the port and set the attributes shown in Fig. 32; click Apply. [Figure 32: Add new GPR port. Port properties dialog with Name, Class Name, Port Type (Read, Write, or Read/Write), and Custom Properties (CAPACITY 1).] 6. Now add a new port to the ALU2_READ unit. To do so, execute the following steps: select Components > Add Port and click anywhere inside the ALU2_READ box; click on the newly added port and set the attributes shown in Fig. 33; click
28. PPING OP_MAPPING GENERIC DATATYPE CLASSTYPE NORMAL TARGET int normal OP MAPPING GENERIC DATATYPE INT CLASSTYPE IMM TARGET int ix OP MAPPING GENERIC DATATYPE INT CLASSTYPE NORMAL TARGET int normal OP MAPPING GENERIC DATATYPE INT CLASSTYPE ANY TARGET int_ar OP MAPPING GENERIC DATATYPE INT CLASSTYPE CALL PARM TARGET int call rt OP MAPPING GENERIC DATATYPE INT CLASSTYPE ZERO TARGET int zero OP MAPPING GENERIC DATATYPE INT CLASSTYPE CC TARGET int Gei OP MAPPING GENERIC DATATYPE INT CLASSTYPE SP TARGET int sp OP MAPPING GENERIC DATATYPE INT CLASSTYPE FP TARGET int p OP MAPPING GENERIC DATATYPE INT CLASSTYPE PC TARGET int pc OP MAPPING GENERIC DATATYPE INT CLASSTYPE RET VAL TARGET int_retval OP MAPPING GENERIC DATATYPE INT CLASSTYPE RET_ADDR TARGET De int retadc OP MAPPING GENERIC DATATYPE INT CLASSTYPE HILO TARGET int hilo OP MAPPING GENERIC DATATYPE INT CLASSTYPE RISAZ TARGET int rISAZ OP MAPPING GENERIC DATATYPE INT CLASSTYPE RI5A4 TARGET int_rI5a4 OP MAPPING GENERIC DATATYPE INT CLASSTYPE RISAS TARGET int rISAS OP MAPPING GENERIC DATATYPE INT CLASSTYPE RISAL6 TARGET int_rI5ale OP MAPPING GENERIC DATATYPE INT CLASSTYPE RISA TARGET int rISA OP MAPPING GENERIC DATATYPE INT CLASSTYPE TEMP RISA TARGET int temp r OP MAPPING GENERIC DATATYPE INT CLASSTYPE ANY TARGET int ar OP MAPPIN
29. RC3 PT sex O fim ecra LAB LAB GOTO SRCI SRCI LAB PC LAB IGOTO SRCI SRCI LAB PC LAB CALL SRCI SRC2 SRCI LAB SRC2 R RA PC PC LAB PARAM LIST First parameter is src2 rest parameters have to be passed in explicit parameter list CVTDI DEST SRCI DEST F_EVEN SRCI DEST Double R SRCI Convert Integer to Double CVTID DEST SRC1 DEST R SRCI DEST Integer F_EVEN SRCI CVTSI DEST SRCI DEST F SRCI DEST Float SRCI DMTCI DEST SRCI DEST F SRCI DEST SRCI Move a value of a register in R to a register in F CVTSD DEST SRCI DEST F SRCI DEST Float SRCI F_EVEN CVTDS DEST SRC1 DEST F_EVEN SRCI DEST Double F SRCI MTCI DEST SRCI DEST F SRCI DEST SRCI Move a value of a register in R to a register in F MECI DEST SRCI DEST R SRCI DEST SRCI Move a value of a register in F to a register in R TRUNCID DEST SRCI DEST R SRCI F_EVEN DEST SRCI Truncate a Double to make a Integer DEST SRCI R R DMFC DEST SRCI DEST R SRCI F DEST SRCI Move a value of a register in F to a register in R R F F TRUNCIS DEST SRC1 DEST R SRCI 83 EXPRESSION User Manual 2003 ACES Laboratory IABS DEST SRC1 DEST R SRCI R DEST abs SRC1 DABS DEST SRCI DEST F_EVEN SRCI DEST abs SRC1 F_EVEN SQRT DEST SRCI DEST R SRCI R DEST sqrt SRC1 DEST SRC1 DEST R SRC1 R DEST exp SRC1 Table 1 The ISA of Gen
30. [Data Paths dialog listing, continued: GPRFile read-port datapaths to BR_READ and LDST_READ, the LDST_EX to L1 datapath, and the WB write-port datapaths to FPRFile and GPRFile; buttons: Remove All, Remove, Cancel.] [Figure 50: Removing storage paths.] 4. Click OK to commit changes. 5. Now we need to add storage paths between IL1 and MainMem, and between L1 and MainMem. First, click the button to add a connection component on the toolbar on the left, or select Add Connection from the Components menu. Click inside the IL1 component and then click inside the MainMem component on the screen. A connection component will be added between the IL1 and MainMem components. 6. Repeat the procedure to add a connection between L1 and MainMem. 7. Now we need to add the storage paths. Click the button to add a datapath on the toolbar on the left, or select Add Datapath from the Components menu. Click inside the IL1 component. Next, click on the connection component connecting IL1 to MainMem. Finally, click inside the MainMem component. Right-click anywhere on the screen to finish
31. READ and ALU2_EX. 3. After deleting the components, it is also necessary to delete the datapaths. This can be done as follows: invoke Components > Edit Datapaths; select each datapath going through ALU2_READ and remove it by clicking Remove (Fig. 46 shows a datapath to be removed); click OK at the end. [Data Paths dialog listing: storage paths IL1 to L2, L1 to L2, and L2 to MainMem, followed by the FPRFile and GPRFile read-port datapaths to ALU1_READ, ALU2_READ, FALU_READ, BR_READ, and LDST_READ.]
32. [Data Paths dialog listing, continued: the LDST_EX to L1 datapath and the WB write-port datapaths to FPRFile and GPRFile; buttons: Remove All, Remove, Cancel.] [Figure 51: Adding new storage paths.] 9. Save the EXPRESSION machine description into acesMIPS.xmd. 10. Repeat the steps described in Section 2.3 to evaluate the modified architecture. Expected Result: You can see a marked degradation of performance on removing the L2 cache. This does not imply, however, that adding a new level of cache to the original memory subsystem configuration is guaranteed to improve performance. 6 Benchmarks. The benchmarks that can be used for testing the EXPRESSION framework comprise the following: Livermore Loops benchmarks (LLs) and Multimedia kernels benchmarks (MMs). For each benchmark, the IR dump is generated by EXPRESS, and SIMPRESS gives the number of cycles taken to run the generated code on the acesMIPS architecture. For <filename>.c, the generated IR dump is stored in <filename>_DUMP_IR_AFTER_REGALLOC.txt. 7 Open Issues and Future Directions. This is the first release of EXPRESSION, and there are some limitations in the usage of the tool set. We enumerate the limitations observed so far as follows: In this release, the applications having function calls are not support
33. [OPERAND MAPPING dialog listing: each entry maps a GENERIC (DATATYPE, CLASSTYPE) pair to a TARGET register class, for example INT/SP to int_sp, INT/PC to int_pc, and INT/MEM to int_mem, followed by the DOUBLE entries.] [Figure 36: Operand mapping to int_odd.] 4. Now change the register accessibility of the destination operand of mult and the source operand of mflo and mfhi from int_hilo to int_odd. This can be done as follows:
34. a new pipeline stage and add a dummy stage with the attributes shown in Fig. 44. The modified architecture would resemble Fig. 43. [Figure 44: Dummy stage. Unit properties dialog with Name, Class Name (SimpleStageUnit), Supported OpCodes, Capacity, Timing (mult 1), Instruction In, Instruction Out, and Custom Properties (ARGUMENT _UNIT_).] 3. Save the EXPRESSION description and evaluate the architectural modification. Expected Result: Pipelining a multi-cycle operation should enhance performance. 5.2.3 Deleting a Pipeline Path. Often there are more resources in the architecture than an application requires. If we are designing an architecture for a given application, we need to remove the resources that the application does not use. We show, using an example, how to delete a pipeline path. Starting with acesMIPS as the base architecture, we will delete the pipeline path through ALU2. Here are the steps: 1. Load the acesMIPS architecture. 2. First, select ALU2_READ and click Components > Delete to delete it. Now select ALU2_EX and delete it. You will find that all the connections to and from the units are also deleted automatically. Then delete the unused latch in the Decode unit. The architecture without the ALU2 units is shown in Fig. 45. Figure 45: After deletion of ALU2_
35. [Figure 11: Cache Properties. Storage properties dialog with Type (Register File, ICache, DCache, SRAM, DRAM), Name, Class Name, Word Size, Line Size, Associativity, Cache Lines, Custom Properties, Access Time, and Address Range fields.] A cache storage element (ICache or DCache) is characterized by properties shown in the Properties window, which appears when the storage element is clicked on the screen. The fields shown above are described below. Name: name of the cache. Class Name: class of the cache; can only be Storage. Word Size: number of bytes in a word. Line Size: number of words in a line. Associativity: associativity level of the cache. Cache Lines: number of lines in the cache. Access Time: time to access the cache, in cycles. Address Range: range of addresses associated with the cache. Custom Properties: other miscellaneous properties. 4.1.1.7 Storage: RAM. [Figure 12: Main Memory Properties. The same dialog for the main memory, with Access Time set to 50.] A RAM (DRAM or SRAM) storage element is characterized by properties shown in the Properties window, which appears when the storage element is clicked on the screen. The fields shown above a
36. and behavior. Each operand is classified either as source or as destination. Further, each operand has an associated list of register files to which it can be bound. These lists are specified in the VAR_GROUPS subsection. A.2 Instruction. This subsection captures the parallelism available in the architecture. An Instruction is viewed as containing operations that can be executed in parallel. Each Instruction contains a list of slots to be filled with operations, with each slot corresponding to a Functional Unit. A.3 Operation Mappings. In this subsection, the user specifies information needed by instruction selection and by the architecture-specific optimizations of the compiler. Each entry in this subsection represents the mapping of a sequence of operations to another sequence of operations. The mapping can be from generic compiler operations to target processor operations, in which case it is used by the instruction selection algorithm, or from target operations to target operations, in which case it is used for architecture-dependent optimizations. The instruction selection algorithm uses a tree parsing technique based on dynamic programming. A.4 Components. This subsection describes each RT-level component in the architecture. The components can be any of: pipeline units, functional units, storage elements, ports, and connections. Each component also has an optional list of attributes. The attributes
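To make the idea of an operation mapping concrete, here is a minimal Python sketch of a generic-to-target rewrite, using the mac rule from the tutorial (IMUL, MFLO, IADD mapped to mac). It is not the EXPRESS instruction selector: the manual describes tree parsing with dynamic programming, whereas this sketch only does a naive linear match over a flat operation list. The names `Op`, `match`, and `rewrite` are hypothetical, and the sketch assumes the intermediate registers (the HILO result and the MFLO destination) are not used after the matched sequence.

```python
from typing import NamedTuple, Optional

class Op(NamedTuple):
    opcode: str
    dst: str
    srcs: tuple

# Each pattern entry is (opcode, dst_var, src_vars); the integers are the
# REG numbers of the rule, not physical registers.
GENERIC = [("IMUL", 1, (2, 3)), ("MFLO", 4, (1,)), ("IADD", 5, (6, 4))]
TARGET = ("mac", 5, (2, 3, 6))

def match(ops, pattern) -> Optional[dict]:
    """Bind rule REG numbers to actual register names, or return None."""
    if len(ops) != len(pattern):
        return None
    binding = {}
    for op, (opcode, dst, srcs) in zip(ops, pattern):
        if op.opcode != opcode or len(op.srcs) != len(srcs):
            return None
        for var, reg in zip((dst, *srcs), (op.dst, *op.srcs)):
            # A REG number must refer to the same register everywhere.
            if binding.setdefault(var, reg) != reg:
                return None
    return binding

def rewrite(ops):
    """Replace every occurrence of the GENERIC sequence by the TARGET op."""
    out, i, n = [], 0, len(GENERIC)
    while i < len(ops):
        binding = match(ops[i:i + n], GENERIC)
        if binding is not None:
            opcode, dst, srcs = TARGET
            out.append(Op(opcode, binding[dst], tuple(binding[s] for s in srcs)))
            i += n
        else:
            out.append(ops[i])
            i += 1
    return out
```

For example, `rewrite([Op("IMUL", "hilo", ("r2", "r3")), Op("MFLO", "r4", ("hilo",)), Op("IADD", "r5", ("r6", "r4"))])` collapses the three generic operations into a single `mac` operation with destination `r5` and sources `r2`, `r3`, `r6`. A sequence whose IADD does not consume the MFLO result is left unchanged.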
37. anged instruction set can be saved for future reference by invoking Save Instruction Set Description, as shown in Fig. 13. We discuss the different options in the Instruction Set menu in the following subsections. 4.1.2.1 set VAR_GROUPS. Go to the Instruction Set menu and select the set VAR_GROUPS option. [Figure 14: Setting VAR_GROUPS. The dialog lists each var_group with its Name, Datatype, and Components, for example any_retaddr (INT, GPRFile 31), double_immediate (DOUBLE, IMM), and float_normal (FLOAT, FPRFile).] The target registers are classified into new var_groups (register classes) based on their data types and their mappings to the var_groups in the generic register files. For example, the var_group int_hilo refers to the register holding the output of a multiplication. The var_group int_fp is used to capture the register used as the frame pointer. This section allows you to specify the var_groups, which are used later when specifying the allo
38. can be any of the following. SUBCOMPONENTS: if the component is a compound component, specifies the list of subcomponents. LATCHES: if the component is a unit, specifies the list of latches that the unit is attached to. PORTS: the list of ports attached to this component. CONNECTIONS: the list of connections attached to this component. OPCODES: the list of opcode groups that this component accepts. Note: if this attribute is all, the component does not make a distinction between opcodes. TIMING: for multi-cycle or pipelined units, specifies the timing behavior. Timing can be specified on a per-opcode basis if necessary. CAPACITY: the number of operations that can be accepted by this component in a single cycle. The default is a single operation per cycle. A.5 Pipeline and Data Transfer Paths. This subsection describes the netlist of the processor. The pipeline description provides a mechanism to specify the units that comprise the pipeline stages, while the data transfer paths description provides a mechanism for specifying the valid data transfers. This information is used both to retarget the simulator and to generate the reservation tables needed by the scheduler. The pipeline paths and the data transfers represent the structural netlist information for the architecture. They can be generated automatically from a schematic capture tool. When writing or generating this information, false paths may be present in the archit
39. dified by varying the ACCESS_TIMES attribute of the component. You must perform the following steps to change access times using the GUI: 1. Load acesMIPS.gmd and acesMIPS.isd using Load Graphical Machine Description and Load Instruction Set Description, respectively. 2. Click on the storage component whose access time is to be changed. 3. Change the value in the Access Time field. [Figure 47: Changing access time. Storage properties dialog for L2, with Access Time set to 5.] 4. Save the EXPRESSION machine description into acesMIPS.xmd. 5. Repeat the steps described in Section 2.3 to evaluate the modified architecture. Expected Result: In most cases, decreasing the access time improves performance. However, there might not be a significant change if the storage component is not accessed by the application. In any case, performance cannot deteriorate if the access time is decreased. 5.3.2 Changing Associativity. Associativity in caches is an important parameter that can affect miss rate and hit time. Greater associativity can come at the cost of increased hit time. By varying the ASSOCIATIVITY attribute of caches, the impact of this parameter on
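As a rough way to reason about these memory experiments before running the simulator, the standard average memory access time (AMAT) formula can be applied: AMAT = hit time + miss rate x miss penalty, evaluated from the level closest to main memory outward. The Python sketch below is only a back-of-the-envelope model, not what SIMPRESS computes. The function name `amat` is hypothetical, the access times (1, 5, and 50 cycles) loosely follow values visible in the storage dialogs, and the miss rates are invented for illustration.

```python
def amat(levels, memory_time):
    """Average memory access time in cycles.

    levels: list of (access_time, miss_rate) tuples, ordered from the cache
            closest to the processor to the one closest to main memory.
    memory_time: access time of main memory in cycles.
    """
    penalty = memory_time
    # Work outward from main memory: each level's miss penalty is the
    # average access time of everything behind it.
    for access_time, miss_rate in reversed(levels):
        penalty = access_time + miss_rate * penalty
    return penalty

# Hypothetical miss rates: 10% in L1, 20% in L2.
with_l2 = amat([(1, 0.10), (5, 0.20)], memory_time=50)
without_l2 = amat([(1, 0.10)], memory_time=50)
```

Under these assumed miss rates the two-level hierarchy averages about 2.5 cycles per access, against about 6 cycles when L1 misses go straight to main memory. The same function can be used to see the associativity trade-off: a larger hit time only pays off if the miss rate drops enough to compensate.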
40. e Instruction Set menu. Set the Datatype to INT and Components to GPRFile 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29. The following snapshot shows the portion to be added. [Figure 35: New register class int_odd. VAR_GROUPS dialog listing the existing var_groups, with the new int_odd entry (INT, odd-numbered GPRFile registers) in the entry fields.] Press the button to add it to the list and click OK. 3. After successfully adding int_odd to the set of target register classes, specify the mapping of the generic HILO class to int_odd instead of int_hilo. To do so, choose Instruction Set > set OPERAND MAPPING and replace int_hilo by int_odd. The changed window is shown as follows. OPERAND MAPPING GENERIC DATAT
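Typing long register lists by hand is error-prone, so it can help to generate the Components text for a register class. The following Python sketch builds the odd-numbered GPR class described above and, for comparison, the even and odd FPR classes listed in the VAR_GROUPS dialog. The helper names `registers` and `components_field` are hypothetical, and the space-separated output format is an assumption based on how the dialog contents appear in this manual; check the GUI for the exact syntax it accepts.

```python
def registers(first: int, last: int, step: int = 1) -> list[int]:
    """Register numbers from first to last (inclusive) with the given step."""
    return list(range(first, last + 1, step))

# (datatype, register file, register numbers) for a few register classes.
var_groups = {
    "int_odd": ("INT", "GPRFile", registers(1, 29, 2)),
    "double1_normal": ("DOUBLE", "FPRFile", registers(0, 30, 2)),
    "double2_normal": ("DOUBLE", "FPRFile", registers(1, 31, 2)),
}

def components_field(group: str) -> str:
    """Text for the Components field (assumed format: file name, then numbers)."""
    _, regfile, regs = var_groups[group]
    return regfile + " " + " ".join(str(r) for r in regs)
```

Calling `components_field("int_odd")` yields the register file name followed by the fifteen odd register numbers from 1 to 29.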
41. Select set OP_GROUPS from the Instruction Set menu. Double-click on ALU_Unit_ops to list the operations in this operation group. Select mult from the list and change the value of Operand 3 Type to int_odd. The changed window for mult is shown in Fig. 33. Click Apply to commit changes. Similarly, select the mflo and mfhi operations in turn and change the value of Operand 1 Type for each operation to int_odd. Click Apply to commit changes. Click OK to commit all the changes and close the window. [Operation Groups dialog for mult in the ALU unit operation group, with Name, Op Type, Behavior, Operand and Operand Type fields, and the ASM format entries.]
42. e the connection specifies a link between a port of a unit and a port of a storage element such as a register file or memory; (3) a storage element and another storage element, in which case we are adding a storage connection, i.e., a connection between storage elements of the memory subsystem. 4.2.6 Adding Pipeline stage. [Figure 27: Add Pipeline stage.] To add a pipeline stage, click on the appropriate button and then click on the screen. 4.2.7 Adding Datapath. [Figure 28: Add datapath.] Datapaths are added between units and storage elements, or between two storage elements. Whenever new ports and connections are added, datapaths must also be added explicitly. To add a datapath between a storage element and a unit, click on the button to add a datapath (Fig. 28). Click on the storage element (make sure not to click on a port within the storage element). Next, click on the unit (make sure not to click on a port or a latch within the unit). Next, click on a port in the storage element. Next, click on the connection from that port to the port in the unit you clicked on earlier. Finally, click on the port in that unit to which the connection is attached. To finish adding the datapath, right-click anywhere on the screen (make sure not to click on any component on the screen). The datapath has now been added, and you can go to the Components menu and select
43. ease refer to [2]. 2.1 EXPRESSION Package. Unzipping acesMIPS.zip yields the following directories: scripts (useful scripts for preprocessing the input application), expr (EXPRESS and SIMPRESS source code), and benchmarks (applications in C that we tested our system on: Livermore loops and Multimedia kernels). The objective is to run the applications in the benchmarks directory through the EXPRESSION framework, comprising EXPRESS and SIMPRESS. First, copy the files in the expr directory to a suitable work directory on an i686 machine. This directory is referred to as the <work> directory in Section 2.3. Then copy the files in the scripts directory to a suitable scripts directory <scripts_dir> on a Sparc Solaris 2.7 machine. This step is for anyone wishing to compile C applications other than those provided in the benchmark suite included with the release. The EXPRESSION framework executes a C application only after it has been compiled (preprocessed) using the scripts in this directory to generate two files, xxx.defs and xxx.procs, where xxx is the name of the C file (see the figure below). [Figure: a C program (e.g., LL1.c) is preprocessed, using the scripts or online, into LL1.defs and LL1.procs, after which it is ready for the EXPRESSION framework.] If you do not have a SUN Sparc machine and want to compile your own C applications for use with the EXPRESSION framework, you can do
44. ease the parallelism in the datapath. The parallel resource can be a single-cycle or multi-cycle functional unit, or a pipelined unit. A multi-cycle operation can equivalently be performed on a pipelined functional unit, which increases the number of pipeline stages. Pipelining a multi-cycle operation should increase performance in cases where the multi-cycle operation is used extensively. A pipelined functional unit for operations running for n cycles is modeled by having one stage of class ExecuteUnit and n-1 stages of class SimpleStageUnit in the pipeline path. As an illustration, let's convert the 2-cycle mult operation discussed in Section 5.2.1 into a two-stage pipelined operation. You need to make the following changes to the architecture obtained in Section 5.2.1: 1. Click on the ALU2_EX box and change the timing for MultGroup to mult 1, as shown in Fig. 42. [Figure 42: First stage of mult. Unit properties dialog with Timing set to mult 1.] [Figure 43: Modified architecture after adding a pipeline stage.] 2. Add
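The expected benefit of pipelining can be estimated with simple cycle arithmetic. The Python sketch below compares a multi-cycle unit with its pipelined equivalent and lists the stage classes used to model an n-cycle pipelined unit (one ExecuteUnit followed by n-1 SimpleStageUnit stages). It is an idealized model, assuming a stream of independent operations with no stalls or hazards; the function names are hypothetical and the numbers are not simulator output.

```python
def multicycle_cycles(num_ops: int, latency: int) -> int:
    """Cycles for num_ops back-to-back operations on a unit that stays
    busy for `latency` cycles per operation."""
    return num_ops * latency

def pipelined_cycles(num_ops: int, latency: int) -> int:
    """Cycles on a pipeline with `latency` single-cycle stages that accepts
    one operation per cycle: the first result appears after `latency`
    cycles and each further operation adds one cycle."""
    if num_ops == 0:
        return 0
    return latency + (num_ops - 1)

def stage_classes(latency: int) -> list[str]:
    """Stage classes modeling an n-cycle pipelined functional unit."""
    return ["ExecuteUnit"] + ["SimpleStageUnit"] * (latency - 1)
```

For the 2-cycle mult, `stage_classes(2)` gives one ExecuteUnit stage and one SimpleStageUnit stage. Ten independent multiplications take 20 cycles on the multi-cycle unit and 11 cycles on the pipelined one, while a single multiplication takes 2 cycles either way, which is why the gain shows up only when the operation is used extensively.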
45. ecture. This problem is solved by the fact that, besides the netlist, behavioral information is also present. False paths due to illegal operations will never be activated, since the illegal operations are omitted from the description. False paths due to illegal groups of operations, either in sequence or in parallel, can be tackled either by specifying the common resource used by the operations, to raise a conflict in the reservation tables, or by explicitly specifying the group of operations as illegal. A.6 Memory Subsystem. This subsection describes the properties of the components in the memory subsystem that are required by the memory-aware compiler optimizations. In EXPRESSION, the netlist specification provides the connectivity between the various storage components and units. The attributes of each storage component that are useful for the memory-aware compiler optimizations are specified in this section. EXPRESSION can be used to describe diverse traditional and non-traditional memory systems. Non-traditional memory systems differ from conventional simple memory hierarchies in several ways. First, the complex organization of the components in these systems may result in a partitioned address space. The partitioned address space is captured using the ADDRESS_RANGE parameter associated with the relevant memory units. The ACCESS_TIMES parameter captures the latency information for ind
46. ed. Compilation steps exist as three passes: pcProGUI, expression console, and acesMIPS console. A complex instruction supported in the target architecture must be composed of generic instructions having the same types. The compiler is not aware of the presence of the memory hierarchy, but the simulator is. The register file is partitioned to the extent that there is a partition in the generic machine. 8 References. [1] A. Khare, N. Savoiu, A. Halambi, P. Grun, N. Dutt, and A. Nicolau. V-SAT: A visual specification and analysis tool for system-on-chip exploration. Proc. EUROMICRO, 1999. [2] P. Grun, A. Halambi, A. Khare, V. Ganesh, N. Dutt, and A. Nicolau. EXPRESSION: An ADL for System Level Design Exploration. ICS Technical Report 98-29, University of California, Irvine, September 1999. [3] P. Grun, A. Halambi, N. Dutt, and A. Nicolau. RTGEN: An algorithm for automatic generation of reservation tables from architectural descriptions. ISSS, San Jose, CA, 1999. [4] A. Nicolau and S. Novack. Trailblazing: A hierarchical approach to percolation scheduling. ICPP, St. Charles, IL, 1993. [5] P. Mishra, P. Grun, N. Dutt, A. Nicolau. Memory Subsystem Description in EXPRESSION. ICS Technical Report 00-31, University of California, Irvine, October 2000. [6] P. Mishra, M. Mamidipaka, N. Dutt. A Framework for Memory Subsystem Exploration. CECS Technical Report 01-20, University of California, Irvine,
47. egister files SP PC and FP contain one register each namely sp pc and fp respectively The following table presents the Instruction Set Architecture of the generic machine 78 EXPRESSION User Manual 2003 ACES Laboratory OPCODE PARAMETERS REGISTER ACCESS FUNCTIONALITY NOP Pf No operation ICONSTANT DEST SRCI DEST R SRCI IMM DEST SRC1 Moves a constant to a register DCONSTANT DEST SRCI DEST F_EVEN SRCI DEST SRCI IMM FCONSTANT DEST SRCI DEST F SRCI IMM DEST SRCI IASSIGN DEST SRC1 DEST R SRCI R DEST SRCI Move operation ASSIGN DEST SRCI DEST SRCI DASSIGN DEST SRCI DEST F_EVEN SRCI DEST SRCI F_EVEN FASSIGN MFLO DEST SRCI DEST R SRCI HILO DEST SRCI Moves the lower bits of HILO to a register in R DEST SRCI DEST R SRC1 HILO DEST SRCI Moves the higher bits of HILO to a register in R DEST SRCI DEST HILO SRCI R DEST SRCI Moves ona bits of HILO DEST SRC1 DEST HILO SRCI R DEST SRC1 Moves a register in R to lower bits of HILO DEST SRC1 DEST R SRCI R DEST SRCI SRC2 SRC2 SRC2 R DEST SRC1 DEST F EVEN SRCI DEST SRCI SRC2 SRC2 F EVEN SRC2 F EVEN DEST SRC1 DEST F SRC1 F DEST SRCI SRC2 SRC2 SRC2 F DEST SRC1 DEST R SRCI R DEST SRCI SRC2 SRC2 SRC2 R DEST SRCI DEST F_EVEN SRC1 DEST SRC1 SRC2 SRC2 F_EVEN SRC2 F_EVEN FSUB DEST SRC1 DEST
48. er, which generates the required files for the EXPRESSION toolkit, in which case a Sparc Solaris 2.7 machine is not required. EXPRESS is a retargetable compiler centered around a generic machine, described in Appendix B. An application in C is preprocessed by the GCC-based front end to generate the front-end files <filename>.procs and <filename>.defs using the generic machine Instruction Set Architecture (ISA). EXPRESS then reads the front-end files, builds an Intermediate Representation (IR) amenable to different optimizations, and targets the architecture described in an EXPRESSION ADL (Architecture Description Language) description. The output of EXPRESS is a special assembly file named <filename>_DUMP_IR_AFTER_REGALLOC.txt. SIMPRESS reads the special assembly file, simulates the running of the assembly on an architecture template generated from the ADL description, and finally generates area, power, and performance numbers, including cycle count and memory usage statistics. The purpose of the simulator is to assess the efficacy of the code generated by the EXPRESS compiler for the given architecture. The EXPRESSION tool kit also comes with a GUI front end to schematically enter the architecture connectivity and instruction set description. The GUI back end converts the schematic description and instruction set description into EXPRESSION ADL format. For details on EXPRESSION ADL, pl
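When scripting runs over several benchmarks, it is convenient to derive the names of the files in this flow from the C source name. The Python sketch below does only that, following the naming pattern given in the text (<filename>.procs, <filename>.defs, and <filename>_DUMP_IR_AFTER_REGALLOC.txt). The function name `expression_files` is hypothetical, and the sketch does not invoke any EXPRESSION tool or assume where the files are written.

```python
from pathlib import Path

def expression_files(c_source: str) -> dict[str, str]:
    """File names produced for a C source file, per the manual's naming pattern."""
    stem = Path(c_source).stem  # "LL1.c" -> "LL1"
    return {
        "procs": f"{stem}.procs",  # front-end output read by EXPRESS
        "defs": f"{stem}.defs",    # front-end output read by EXPRESS
        "ir_dump": f"{stem}_DUMP_IR_AFTER_REGALLOC.txt",  # EXPRESS output read by SIMPRESS
    }
```

For the benchmark LL1.c, this returns LL1.procs, LL1.defs, and LL1_DUMP_IR_AFTER_REGALLOC.txt.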
49. eric Machine. Legend: R: R0, R1, R2, ..., R31, the Integer Register File. RA: R31. F_EVEN: R0, R2, R4, ..., R30; the registers are paired to form the Double Register File (note that the second, odd-numbered register is implicit). F: F0, F1, F2, ..., F31. <<L: logical left shift. <<A: arithmetic left shift. IMM: immediate value. LAB: address of an instruction; could be an actual address or a name.
50. ff ea eet LAND DEST SRC1 DEST R SRC1 R DEST SRCI amp amp SRC2 SRC2 R SRC2 IOR DEST SRC1 DEST R SRC1 R DEST SRCI II SRC2 oe E INOR DEST SRC1 DEST R SRC1 R DEST SRC1 NOR e un Sama IXOR DEST SRC1 DEST R SRC1 R DEST SRC1 XOR ee qu noie DNEG DEST SRCI DEST F_EVEN SRCI DEST 1 SRCI F_EVEN IVLOAD DEST SRC1 DEST R SRCI R DEST M SRC1 SRC2 SRC2 IMM SRC2 Loads the value in memory at address srcl src2 into destination DVLOAD DEST SRCI DEST F_EVEN SRCI SRC2 R SRC2 IMM SRC2 FVLOAD DEST SRCI DEST F SRCI R DEST MISRCI HIVLOAD DEST SRCI DEST R SRCI R DEST MISRCI SRC2 SRC2 IMM SRC2 Load only half a word DEST R SRCI R DEST MISRCI SRC2 SRC2 IMM SRC2 SRC2 R SRC2 IMM SRC2 Load 4 words QIVLOADU DEST SRCI DEST R_FOUR SRC1 SRC2 R SRC2 IMM SRC2 IVSTORE SRCI SRC2 SRCI R SRC2 IMM M SRC2 SRC3 SRC3 SRC3 R SRCI Store srcl register into memory location of address src2 src3 SRC3 IMM SRC3 R SRCI FVSTORE SRC1 SRC2 SRCI F SRC2 IMM M SRC2 SRC3 SRC3 SRC3 R SRCI HIVSTORE SRCI SRC2 SRCI R SRC2 IMM M SRC2 SRC3 82 EXPRESSION User Manual 2003 ACES Laboratory SRG SRC3 R SRCI HIVSTOREU SRCI SRC2 SRCI R SRC2 IMM M SRC2 SRC3 QIVSTORE SRCI SRC2 SRCI R_FOUR SRC2 M SRC2 SRC3 PT se PE QIVSTOREU SRCI SRC2 SRCI R_FOUR SRC2 M SRC2 S
51. hical description from the File menu. Then load <run>/acesMIPS.isd by selecting Load instruction set description from the Instruction Set menu. 8. After making the necessary changes to the architecture, save the EXPRESSION description of the architecture by clicking Save EXPRESSION description from the File menu. Choose <run>/acesMIPS.xmd to save the ADL description. 9. Exit from pcProGUI. [Figure: Project Settings dialog, Debug tab, with the executable, working directory, and program arguments for the debug session] 10. Now set expression console as the active project. This project takes the EXPRESSION description in <run>/acesMIPS.xmd and generates the different intermediate files required to retarget the compiler and the simulator. It also generates <run>/mem.config, containing the memory configuration. 11. Compile the project and run with the following command line options p
52. [Figure 46: Remove pipeline paths through ALU2_READ] 4. Finally, we must remove the entry for ALU2_EX from the Instruction Description Section. This can be done as follows: • Invoke Instruction Set > Set Instruction Description. • Select the entry for ALU2_EX and remove it. • Click OK in the end. 5. Save the EXPRESSION description and evaluate the architectural modification. Expected Result: A degraded performance owing to the decrease in resources. 5.3 Memory Subsystem Exploration for Area, Power and Performance. The memory subsystem consists of data and instruction caches and main memory (DRAM) modules. All of these components are fully parameterizable. In this section we describe a few experiments on memory exploration. For further details on memory exploration, please refer to [6]. 5.3.1 Changing Access Times. Access times of caches and main memory have a big impact on system performance. The access time of every memory subsystem component can be mo
53. ing a complex operation is that it gets rid of extra fetch delays. Section 5.1.1 discusses how to add a complex operation. [Figure 29: Base architecture] Register accessibility plays an important role in instruction set design. The number of supported opcodes can be increased by decreasing the accessibility to registers. However, decreasing the register accessibility can lead to spilling due to increased register pressure. The user can study the instruction set design trade-offs by varying the register accessibility of different operations. Section 5.1.2 shows how to vary register accessibility. A complex operation usually needs more input ports than its constituent simple operations. Consequently, adding new operations may require adding new read ports to the register file. Instruction Selection plays a pivotal role in converting a set of simple generic operations into a complex target operation. It is based on tree-based mapping rules, where the priority of a mapping is determined by the order in which the rules are specified. For example, the rule for the mac operation, viz. GENERIC IMUL DST 1 REG 1 SRC 1 REG 2 SRC 2 REG 3 MFLO DST 1 REG 4 SRC 1 REG 1 IADD DST 1 REG 5 SRC 1 REG 6 SRC 2 REG 4 TARGET mac DST 1 REG 5 SRC 1 REG 2 SRC 2
54. ions • Select NewOp and change its Name to mac. Set all the fields as shown in Fig. 30. • Set the ASM FORMAT as COND dst1 reg src1 reg src2 reg src3 reg PRINT \t4\t<opcode>\t<dst1> <src1> <src2> <src3>\n • Set the IR DUMP FORMAT as COND dst1 reg src1 reg src2 reg src3 reg PRINT \t4\t<opcode>\t<dst1>\t<src1> <src2> <src3>\n • Click Apply and then OK to commit all the changes. 4. Select the set TREE_MAPPING option from the Instruction Set menu and add the rule for the mac operation as shown in Fig. 31. Make sure the rule appears before the rules for IMUL, MFLO, and IADD. [Figure: Operation Groups dialog showing the mac operation]
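The requirement that the mac rule appear before the rules for IMUL, MFLO, and IADD can be illustrated with a small sketch. This is not EXPRESS code: the rule table, the `select` function, and the matching on a flat opcode sequence are simplifications assumed here for illustration (the manual describes tree-based matching). The target opcode names mult, mflo, addu, and mac are taken from the manual text.

```python
# Hypothetical, simplified model of ordered instruction-selection rules.
# Each rule maps a sequence of generic opcodes to one target opcode.
# Rules are tried in list order, so earlier rules have higher priority.
RULES = [
    (("IMUL", "MFLO", "IADD"), "mac"),  # complex rule listed first
    (("IMUL",), "mult"),
    (("MFLO",), "mflo"),
    (("IADD",), "addu"),
]


def select(generic_ops, rules=RULES):
    """Rewrite a list of generic opcodes using the first matching rule."""
    target = []
    i = 0
    while i < len(generic_ops):
        for pattern, opcode in rules:
            if tuple(generic_ops[i:i + len(pattern)]) == pattern:
                target.append(opcode)
                i += len(pattern)
                break
        else:
            # No rule matched the opcode at position i.
            raise ValueError(f"no rule matches {generic_ops[i]}")
    return target
```

With the table above, `select(["IMUL", "MFLO", "IADD"])` yields a single mac. If the mac rule is moved to the end of the list, the single-opcode rules match first and the same input is rewritten as mult, mflo, addu, which is why rule order matters.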
55. irectly to main memory. Let us enumerate the steps for deleting the L2 cache using the GUI: 1. Load acesMIPS.gmd and acesMIPS.isd using Load Graphical Machine Description and Load Instruction Set Description, respectively. 2. Click on the L2 cache and press DELETE. This removes the cache as well as all of its connections. 3. From the Components menu, select Edit datapaths. Click on the three storage paths that have a reference to L2 to select them, and then click on Remove repeatedly until all three of the references are deleted. [Figure: Data Paths dialog listing the datapaths between the register files and the units]
56. ividual memory components in the architecture. The latency associated with any level of the memory hierarchy can be computed using the hierarchy information from the netlist and the ACCESS TIMES attribute of each component. Second, they may contain novel components (e.g., SDRAM, frame buffer, stream buffer). These components may allow for varying access times depending on the mode of access. The user can specify this feature as a list of access times in the ACCESS TIMES attribute of the component. The language provides certain predefined parameters such as TYPE and SIZE. The TYPE parameter is used to identify each storage component. EXPRESSION contains certain predefined types such as REGFILE, DRAM, CACHE, and SRAM. New types can be added as user-defined types. The user can also add new parameters in order to specify the features of novel components for use by the compiler optimizations. Note that some parameters, such as the number and type of ports associated with each component, are described in the Components Specification subsection. For details on memory subsystem description, please refer to [5]. The EXPRESSION ADL description of acesMIPS is available in <work>/acesMIPSDll/bin/Example_acesMIPS.xmd. Appendix B: Generic Machine Model. The front end of the retargetable EXPRESS compiler translates the input application in C to generic instructions and generic o
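As a rough illustration of computing the latency of a hierarchy level from the hierarchy information and per-component access times, the sketch below walks a chain of components and sums their access times. Everything in it is an assumption made for illustration: the component names, the cycle counts, the dictionary layout, and the simple additive model (a request pays the access time of every level it passes through). It does not reproduce how EXPRESSION itself performs the computation.

```python
# Hypothetical hierarchy: each component records its access time in cycles
# and the next level that a miss is forwarded to (None for the last level).
HIERARCHY = {
    "L1": {"access_time": 1, "next": "L2"},
    "L2": {"access_time": 5, "next": "DRAM"},
    "DRAM": {"access_time": 50, "next": None},
}


def latency_to(level, first="L1", hierarchy=HIERARCHY):
    """Sum access times from the first level down to the requested level."""
    total = 0
    current = first
    while current is not None:
        total += hierarchy[current]["access_time"]
        if current == level:
            return total
        current = hierarchy[current]["next"]
    # The requested level is not reachable from the first level.
    raise KeyError(level)
```

Under these assumed numbers, a request served by L2 costs the L1 access plus the L2 access, and a request that goes all the way to DRAM pays for all three levels.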
57. l be displayed. Note that the button for adding opcodes is not enabled unless you select (highlight) an operation group. For instance, on selecting ALU_Unit_Ops and double-clicking on it, we see the dialog below. [Figure 16: Operations supported by ALU Unit] Clicking on any of the opcodes shown brings up its properties on the right-hand side of the dialog. [Figure: Operation Groups dialog showing the properties of the and opcode]
58. le process of architecture entry in detail. Section 5 is especially important for designers who want to experiment with a base architecture and explore interesting design points in the architecture. [Figure: overview of the manual, including Tool Set-up and Cmd Line Options] Section 6 presents the benchmarks used to evaluate the framework on a base architecture. Section 7 discusses the open issues and directions for research in this framework. Finally, Section 8 provides useful references for anyone interested in understanding the theory behind the framework. Appendix A briefly describes the different sections of the EXPRESSION ADL. The detailed description of the EXPRESSION language can be found in [2]. Appendix B describes the generic machine model used by our retargetable compiler. 1.2 Exploration Features Supported in This Release. Release 1.0 of the EXPRESSION toolkit supports the following exploration features: 1. ISA Exploration: • Adding new complex instructions (Section 5.1.1) • Changing register accessibility (Section 5.1.2) 2. Pipeline Exploration: • Adding a single-cycle/multi-cycle functional unit (Section 5.2.1) • Adding a new pipelined functional unit (Section 5.2.2) • Deleting a pipeline path (Section 5.2.3) 3. Memory Subsystem Exploration: • Modifying access times of caches/memories (Section 5.3.1) • Modifying
59. me: Name of the connection. Class Name: Class that the connection belongs to. Must be either RegisterConnection, for a connection between a unit and a register file, or MemoryConnection, for a connection between a unit and a memory element. Custom Properties: Other miscellaneous properties. 4.1.1.5 Storage (Register File) Properties. [Figure 10: Register File Properties] A register file storage element is characterized by properties that are displayed in the Properties window when the storage element is clicked on the screen. The fields shown above are described below. Name: Name of the register file. Class Name: Class of the register file. Can only be Storage. Width: Width of the register file in bits. Size: Number of registers in the register file. Mnemonic: Prefix to be used for the registers in assembly formats. For example, the General Purpose Register file has registers with the prefix R. Custom Properties: Other miscellaneous properties. 4.1.1.6 Storage (Cache) Properties. [Figure: Storage properties window]
60. [Figure 37: Destination operand mapping for mult] 5. Save the EXPRESSION description into acesMIPS.xmd. 6. Repeat the steps described in Section 2.3 to evaluate the modified architecture. Expected Result: If you run EXPRESS with pIList, you can check the instruction list generated on the console window to find the register allocated for the destination operand of the mult operation. You should observe that one of the odd-numbered registers in the set GPRFile 1-29 is allocated. The same holds for the source operand of the mflo and mfhi operations. Performance, however, should not be affected. Performance can be adversely affected by reducing the register accessibility of the operations to an extent that results in spilling of registers. 5.2 Pipeline Exploration. An architecture can be modified by changing its pipeline. Pipeline changes can be made by adding a new functional unit and routing a new pipeline path through it. The number of existing pipeline paths can also be reduced by deleting resources. 5.2.1 Adding a Single-cycle/Multi-cycle Functional Unit. A new functional unit can be added as a single-cycle, multi-cycle, or a pipelined
61. n the toolbar located on the left and click on the screen to place it. Note that if you want to resize the unit, make sure that it lies completely within the screen. 4.2.2 Adding Storage. [Figure 23: Add Storage] Adding storage elements is similar to adding units. Just click on the Add Storage button on the toolbar located on the left and click on the screen to place it. Note that if you want to resize the unit, make sure that it lies completely within the screen. 4.2.3 Adding Latch. [Figure 24: Add Latch] After clicking on the appropriate button, you must click within a unit already present on the screen to add a latch to it. A latch outside a unit or within a storage element has no significance. 4.2.4 Adding Port. [Figure 25: Add Port] After clicking on the appropriate button, you must click within a unit or a storage element already present on the screen to add a port to it. A port outside a unit or a storage element has no significance. 4.2.5 Adding Connection. [Figure 26: Add Connection] A connection can be added between: 1. a latch in a unit and another unit. In this case, the latch should be an output latch, and adding the connection implicitly adds an input latch to the target unit. 2. a port and another port (unit -> storage or storage -> unit). In this cas
62. ode contains the mac instruction instead of a chain of mult, mflo, and addu instructions. This will lead to an increase in performance because of the reduction in the number of fetches. 5.1.2 Changing Register Accessibility. Individual operands of each operation are mapped to particular register classes. These register classes effectively partition the register file and have a unique mapping to a particular set of registers. There is a fixed set of generic register classes, expressed as class types and data types. Target register classes are specified by invoking set VAR_GROUPS from the Instruction Set menu. Each target register class has a unique mapping to a set of target registers. The mappings of the generic register classes to the target register classes are specified by selecting set OPERAND_MAPPING. The register accessibilities of the operands of operations are changed from the set OP_GROUPS option in the Instruction Set menu. In the acesMIPS architecture, the target register class for the destination of the mult operation is int_hilo, which maps to any of the registers in GPRFile 1-28. Let us modify the register accessibility of the destination operand of mult and the source operands of mfhi and mflo. Suppose we want these operands to access only the odd-numbered registers in GPRFile 1-29. We perform the following steps: 1. Load acesMIPS into the GUI. 2. Create a new register class int_odd using set VAR_GROUPS from th
63. perands. The machine comprising the generic instruction set is called the Generic Machine. The Instruction Selection and Register Allocation phases of the compiler transform the code from the generic instruction set to the instruction set of the target architecture. The phases of a retargetable compiler can be divided into two distinct groups: first, the generic, machine-independent compiler phases, and second, the machine-dependent compiler phases. The machine-independent phases optimize the code for the generic machine and are applied for any target machine. The Generic Machine has a RISC ISA very similar to the MIPS ISA. The operations of the Generic Machine, their functionality, and the register accessibility of their operands are explained in the table below. The Generic Machine has three data types, namely Integer, Double, and Float. It has 7 register files: R, F, CC, SP, FP, PC, and HILO. R is the integer register file and F is the floating-point register file. Pairs of F registers are used for Double data types. Double operands are accessed by the even-numbered registers. The second, odd-numbered register is an implicit operand in such operations. HILO is a special 64-bit register. The multiply operations write their result into this register. Later, the results are extracted from this register using the operations MFLO and MFHI. CC is a separate register file into which the evaluation results of conditionals are written. The r
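The pairing of F registers for Double operands can be sketched as follows. The helper name `double_pair` and the string form of the register names are assumptions for illustration; the only facts taken from the manual are that a Double operand is named by an even-numbered register and that the following odd-numbered register is implicit. The default of 32 registers follows the F0 to F31 range given in the Generic Machine legend.

```python
def double_pair(even_index, num_regs=32):
    """Return the (explicit, implicit) F registers backing a Double operand.

    The explicit register must be even-numbered; the implicit one is the
    next, odd-numbered register.
    """
    if even_index % 2 != 0:
        raise ValueError("a Double operand must name an even-numbered register")
    if not 0 <= even_index < num_regs:
        raise ValueError("register index out of range")
    return (f"F{even_index}", f"F{even_index + 1}")
```

For example, a Double operand named by register 4 occupies F4 and, implicitly, F5; asking for an odd-numbered register raises an error.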
64. [Figure 5: DecodeUnit, an example of a Unit] A Unit is characterized by properties that are displayed in the Properties window when any Unit is clicked on the screen. The fields shown above are described below. Name: Name of the unit. Class Name: Name of the class the unit belongs to. A unit can be one of FetchUnit, DecodeUnit, OpreadUnit, ExecuteUnit, BranchUnit, LoadStoreUnit, and WriteBackUnit. Supported Opcodes: The opcode groups supported by the unit. These groups are specified from the set OP_GROUPS item in the Instruction Set menu (described later). Capacity: Capacity of the Instruction Buffer in the unit. Timing: The time it takes for an instruction to pass through a Unit. all 1 means that every opcode passing through this stage takes 1 cycle. You can specify how much time an individual opcode takes in the unit by appending <opcode name> <time> in the text box. For example, if the mac instruction takes 2 cycles to execute in one of the ALU execute units, it can be specified as shown below. [Figure: Unit Properties window with Timing set to all 1 mac 2]
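The Timing field can be read as a default entry (all) followed by per-opcode overrides. The sketch below parses a string of that shape and looks up the cycle count for an opcode. It assumes whitespace-separated name and cycle-count pairs, as in the example value all 1 mac 2; the exact separators accepted by the GUI text box are not recoverable from this text, and the function names `parse_timing` and `cycles_for` are hypothetical.

```python
def parse_timing(spec):
    """Parse a Timing string such as 'all 1 mac 2' into a dict of cycles.

    Assumes whitespace-separated (name, cycles) pairs.
    """
    tokens = spec.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected name/cycles pairs")
    return {name: int(cycles) for name, cycles in zip(tokens[0::2], tokens[1::2])}


def cycles_for(opcode, timing):
    """Return the per-opcode override if present, else the 'all' default."""
    return timing.get(opcode, timing["all"])
```

With the example value, `cycles_for("mac", parse_timing("all 1 mac 2"))` gives the 2-cycle override, while any opcode without its own entry falls back to the 1-cycle default.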
65. rchitecture exploration. Users should be able to follow the steps enumerated in the following sections and explore various interesting design points starting from any base architecture. For each exploration, we also present the results to be expected. The following subsections take you through a tour of explorations starting from the acesMIPS architecture as the base architecture. Fig. 29 shows a snapshot of the base architecture. It has five pipeline stages: Fetch, Decode, Operand Read, Execute, and Writeback. The Operand Read and Execute stages have five parallel pipeline paths: ALU1, ALU2, Floating Point, Branch, and Load Store. It has two register files: integer and float. It has a two-level cache hierarchy with a common L2 for both data and instructions. It also uses SRAM as a scratch-pad memory. 5.1 ISA Exploration. Most of the instructions of a target machine are obtained from the generic instruction set by a one-to-one mapping of generic to target operations. When two or more generic operations combine to form a target operation, we call the target operation a complex operation. The target instruction set can be made richer by incorporating a large number of useful complex operations. A complex operation is useful for a given application when the sequence of operations forming the complex operation is frequently used. A profiler can identify useful complex operations to be added to a base instruction set. Another advantage of add
66. re described below. Name: Name of the memory. Class Name: Class of the memory element. Can only be Storage. Access Time: Number of cycles to access data in memory. Address Range: Range of addresses associated with the memory. It is used by the memory controller to decide from where to fetch instructions/data. Custom Properties: Other miscellaneous properties. Now that you are familiar with the architecture layout, let's look at the Instruction Set description. 4.1.2 Instruction Set Specification. Go to the Instruction Set menu and select the Load Instruction Set Description option. Select and load the file acesMIPS.isd. [Figure 13: acesMIPS loaded with Instruction Set] The loaded instruction set can be changed by invoking different options from the Instruction Set menu. The ch
67. rogram arguments acesMIPS.xmd SIM RI ASM DUMP. The Debug tab in the settings window is used for specifying the command line options, as shown below. 12. Set acesMIPS console as the active project. This project contains both the EXPRESS compiler and the SIMPRESS simulator. The various command line options available are discussed in the next section. 13. Compile the project, set the command line options discussed in Section 3, and run the project. The acesMIPS console application generates the number of cycles, memory usage, and other statistics in <run>/<filename>.pwrStats. These performance numbers guide designers toward favorable choices and steer the exploration of the architectural design space. 3 Command Line Options. EXPRESS switches that are supported in this release: • ISel (Instruction Selection): Convert a set of generic opcodes into a set of target opcodes. • RA (Register Allocation): Each operand in an instruction is bound to a target register based on its register accessibility. • Tbz (Trailblazing Percolation Scheduling): Perform Trailblazing [4], assuming that the latency of each operation is 1 cycle. • PipeTbz (Pipelined Trailblazing): Based on the reservation tables automatically generated [3] from the datapath, perform Trailblazing Percolation Scheduling. • EXPR ENAME
68. t <src2>\n COND dst1 reg src1 reg src2 imm PRINT \t<opcode>\t<dst1> <src1> <src2>\n Set the IR DUMP FORMAT as follows: COND dst1 reg src1 reg src2 reg PRINT \t4\t<opcode>\t<dst1>\t<src1> <src2>\n COND dst1 reg src1 reg src2 imm PRINT \t4\t<opcode>\t<dst1>\t<src1> <src2>\n 6. Click Apply and then OK to commit the changes. 7. Select set OP_GROUPS from the Instruction Set menu. Double-click on ALU1_Unit_Ops to list the supported operations. Select mult from the list and delete it. 8. Click OK to effect the change. [Figure: Operation Groups dialog showing the mult operation]
69. the Edit datapath menu item to see the datapath. You can also remove a datapath by highlighting it and pressing the Remove button. To add a datapath between two storage elements, click on the appropriate button to add a datapath. Next, click on the first storage element. Click on the connection from that element to the second storage element. Finally, click on the second storage element. To finish adding the datapath, right-click anywhere on the screen (make sure not to click on any component on the screen). The datapath has now been added, and you can go to the Component menu and select the Edit datapath menu item to see the datapath. You can also remove a datapath by highlighting it and pressing the Remove button. 5 Design Space Exploration. We present in this section some of the exploration directions that are important for a system designer. An architectural modification can affect another architectural change positively or negatively, so the designer has to trade off different performance goals while respecting the architectural constraints. For details on how to perform design space exploration of an architecture having a processor, co-processor, and memory subsystem, please refer to [7, 8, 9]. The different architecture explorations comprise Instruction Set Architecture exploration, Micro-architecture exploration, and Memory a
70. this compilation online at http://www.cecs.uci.edu/cgi-bin/cgiwrap/sudeep/file_upload.cgi by uploading the C application to the server, which generates the required files (two files, xxx.defs and xxx.procs) for the EXPRESSION toolkit. Note that the included benchmarks are already compiled/preprocessed: the corresponding .defs and .procs files for each application are included in the benchmarks directory, and if you just plan to use these benchmarks, you can ignore the scripts directory and the entire preprocessing phase (Section 2.2). 2.2 Preprocessing the Application. An application in C (<filename>.c) in the <sun_work> directory is first translated into two files: • <filename>.procs, containing the text section in GENERIC assembly, and • <filename>.defs, containing the data section of the program. The conversion from <filename>.c to the <filename>.procs and <filename>.defs files is done on a Sparc Solaris 2.7 machine using a GCC-based front-end tool provided in the release package. Before running the scripts, go to <scripts_dir>. Give execute permissions to the files in the <scripts_dir> directory. Copy the benchmarks (<filename>.c) to be run to a work directory <sun_work>. Open mips2expr all in an editor and set the variable SCR PATH to the path to <scripts_dir>. Save mips2expr all and change directory to
71. unit. Custom Properties: Other miscellaneous properties. 4.1.1.3 Port. [Figure 8: Setting Port Properties] A port is characterized by properties that are displayed in the Properties window when any port is clicked on the screen. The fields shown above are described below. Name: Name of the port. Class Name: Class that the port belongs to. Must be either Port, for ports bound to storage elements, or UnitPort, for ports bound to units. Port Type: Specifies whether the port is a read, write, or read-write port. Custom Properties: Other miscellaneous properties. For instance, ARGUMENT _SOURCE_1_ indicates that this port will serve as the first source for operations that are associated with the ALU1 unit. CAPACITY 1 indicates that only 1 value can be read from the port. 4.1.1.4 Connection Properties. [Figure 9: Setting Connection Properties] A connection is characterized by properties that are displayed in the Properties window when any connection is clicked on the screen. The fields shown above are described below. Na
72. wed storage types associated with the source and destination operands of an opcode (see Set OP_GROUPS). These var groups are also used while setting OPERAND_MAPPINGS. Name: Name of the aggregate of storage elements. Datatype: Type of data associated with the group. Components: Storage elements actually associated with the group. 4.1.2.2 set OP_GROUPS. [Figure 15: Setting OP_GROUPS] This section specifies the opcodes in the instruction set of the architecture and groups them together into various opcode groups. The buttons shown in Figure 15 allow the addition of a group and the addition of opcodes within a group. The fields on the right-hand side become enabled once an item in the text box on the left-hand side is selected. To select an op group, simply click on it. On double-clicking it, the list of opcodes contained in the group wil
