Motorola SC140 Network Card User Manual

1. command prompt (DOS or UNIX). If you desire, you can use an Integrated Development Environment (IDE). Be sure to consult the appropriate IDE manuals. This document provides step-by-step instructions to walk you through the exercises included in the file SC140 exercises zip. You can download this zip file from the MSC8101 or MSC8102 product page at the following Web address: http www mot com SPS. Solutions to the exercises are provided at the end of this application note. Recommended Reading: SC140 Core Reference Manual (MNSC140CORE/D); SC100 C/C++ Compiler User's Manual (MSC100CCUM/D); StarCore140 Family DSP Core Instruction Set (STCR140ISRM/D); Multisample Programming Technique (STCR140MLTAN/D). © Motorola, Inc., 2000. The following StarCore software development tools were used in the development of the SC140 exercises. Later versions of the SC140 tools should generate similar or better results. Version 1.0 StarCore 100 C Compiler: produces highly optimized code. Compiler features include ANSI C standard compliance, fixed-point optimization, global optimization, and a standard C library. Version 6.3.44 StarCore 100 Assembler: translates assembly language files into machine-readable object files. sc100 ld Linker: links and relocates the object files and produces executable progra
2. The first parameter is passed in d0 if it is a numeric scalar or in r0 if it is an address. The second parameter is passed in d1 if it is a numeric scalar or in r1 if it is an address. Subsequent parameters are pushed onto the stack. The return value, if any, is passed back to the calling function in d0 if it is a numeric scalar or in r0 if it is an address. For simple functions with two parameters or fewer, the stack is not used to pass parameters, and it may be possible to write the entire assembly language function without explicitly using the stack at all. In general, however, the stack is used to pass parameters into the function and to store local variables. Its contents are as shown in Figure 11. Just prior to the function call, parameters 3, 4, 5, and so on are pushed onto the stack in reverse order, and parameters 1 and 2 are stored in d0/r0 and d1/r1 as described previously. The function is then called, and the return address and status register contents are pushed onto the stack by the jsr or bsr instruction. If the called function modifies register d6, d7, r6, or r7, it should first save them on the stack and then restore them before returning. All other registers are free for use without saving or restoring them. The calling function must save these registers if it needs their values to be preserved. On function exit, the status register contents and return address are popped from the stack by the rts instruction and
3. Conditional execution of instructions includes IFT/IFF (IF True bit is True/False) and IFA (IF always), which is unconditionally executed with IFT/IFF. The conditional execution set combinations are very flexible and are represented in Figure 10, which represents the maximum number of ALUs (that is, two) and one Arithmetic Address Unit (AAU) per subset. The C compiler automatically generates the conditional execution set, and some examples are provided to highlight potential code optimization. Figure 10. Control Instructions Using the True Bit. MOTOROLA | Introduction to the SC140 Tools | Compiler Support on StarCore. Hands On: 1. Open the example Ex7.c file. 2. Understand the conditional test in the code. 3. Compile the project with the -Ot2 and -S options. 4. Open the generated assembly file Ex7.sl and look at the conditional instructions within the loop. 5. In the box provided here, write down how many execution sets are within the loop. [Box: Optimized for Time / Optimized for Space] 6. Recompile using the compiler optimization option for code size (-Os option). 7. Open the generated assembly file Ex7.sl and look at the conditional instructions within the loop. 8. Write down how many execution sets are within the loop in the box. 9. Save Ex7.c as Ex7_1.c. 10. Modify the program to obtain two cycles within the loop. Tip: consider using a temporary variable for both storing the immediate
4. [Table: Exercise 4 solutions. Columns: Expected, Simulator. Sections: Aligned, Not Aligned. Rows: move.w (r0),d0; move.2w (r0),d0:d1; move.2f (r0),d2:d3; move.4w (r0),d4:d5:d6:d7; move.2l (r0),d8:d9. Array data: 0x01 0x23 0x45 0x67 0x89 0xAB 0xCD 0xEF 0xBB 0xDD 0xEE 0xFF 0x11 0x22.] The crosses indicate that the results provided by the simulator are not aligned operations. If this is not taken into account, unpredictable results can occur when migrating to the hardware, which requires aligned data. Exercise 5
5. Examine the assembly language file Ex6.sl to see how the inner loop is compiled. Intermediate Version (Compromise Between Memory and Speed): 4. Save Ex6.c as Ex6_1.c. 5. Change the C code of Ex6_1.c according to the following steps: a. Process the first four samples at a time. Replace the implementation of y(n) = Σ a(i)·x(n−i) with the equations defined as Group 0 in Equation 5. b. Replace x(n), x(n+1), x(n+2), x(n+3) with variables, for example var0, var1, var2, var3, respectively, as follows: res0 = a[i]·var0, res1 = a[i]·var1, res2 = a[i]·var2, res3 = a[i]·var3 (Group 0). This processes the first group, Group 0. To process the remaining groups (Group 1 and so on), the values from var0, var1, and var2 from Group 0 must be transferred to var1, var2, and var3, respectively, for processing Group 1. c. Transfer the values into var1, var2, and var3, and load the new sample x(n−1) into var0. 6. Compile the code with the -Ot2 option and run the code to verify that the correct output values are obtained. 7. Recompile Ex6_1.c using the -Ot2 and -S options. The inner loop should be only two cycles long. If not, return to Step 5. During each iteration of the loop, the coefficient a(i) is loaded into a data register. The data value x(n−1−i) is loaded into another data register. The values in the other three registers are reused, but they must first be transferred into the registers where the four MAC instructions expect t
6. &array1[0], &array2[0]. Figure 6. Files for the Local Versus Global Optimization Exercise. 1. Open the two files and understand their functionality. Local Optimization: 2. Compile the two files: ccsc100 -Ot2 Ex3_main.c Ex3_prod.c -o Ex3.eld 3. Run the code: runsc100 -t Ex3.eld. The -t option for runsc100 enables the cycle count generation. Write the cycle count in the box below. [Box: Local Optimization (Default Mode), Cycle Count] Global Optimization: 4. Compile the files using global optimization: ccsc100 -Ot2 -Og Ex3_main.c Ex3_prod.c -o Ex3_glo.eld, where -Og is the global optimization option. 5. Run the code: runsc100 -t Ex3_glo.eld. Write the cycle count in the box below. [Box: Global Optimization (-Og option), Cycle Count] To understand how global optimization makes best use of available information, perform these steps: 6. Recompile the application with the -S option (Stop After Compilation) and with local optimization: ccsc100 -Ot2 Ex3_main.c Ex3_prod.c -S 7. Rename the .sl files as Ex3_mainl.sl and Ex3_prodl.sl. 8. Open the files to see what the compiler has produced. 9. Enable global optimization: ccsc100 -Ot2 -Og Ex3_main.c Ex3_prod.c -S 10. Open Ex3_main.sl to see what the compiler has produced. Since the compiler has all information on the application, it optimizes the application further than with local opt
7. 0x0F80, y[10] = 0x0FC0, y[11] = 0x0F80, y[12] = 0x0F40, y[13] = 0x0F00, y[14] = 0x0EC0, y[15] = 0x0E80, y[16] = 0x0E40, y[17] = 0x0E00, y[18] = 0x0DC0, y[19] = 0x0D80, y[20] = 0x0D40, y[21] = 0x0D00, y[22] = 0x0CC0, y[23] = 0x0C80, y[24] = 0x0C40, y[25] = 0x0C00, y[26] = 0x0BC0, y[27] = 0x0B80, y[28] = 0x0B40, y[29] = 0x0B00, y[30] = 0x0AC0, y[31] = 0x0A80 */ main long res0 res1 res2 short var0 var1 var2 short n x_ptr for n 0 res0 res1 res2 res3 var3 var2 var1 var0 i x_ptr input 14 n 32 n 4 x_ptr x_ptr x_ptr x_ptr res3 var3 var3 var3 var3 var3 x_ptr points to input 11 which is x 3 x n 3 x n 2 x n 1 x n x_ptr now points to x n 1 for i 0 i 12 i res0 res1 res2 res3 var3 var2 var res0 res1 res2 L_mac res3 var2 var1 var a[i] a[i] a[i] a[i] var0 var var2 var3 var0 x_ptr var0 x n i 1 Truncate re
8. address generation hardware, particularly for modulo addressing. For example, consider the move.4w (Rx),Dk instruction. More specifically, with move.4w (R0),D0:D1:D2:D3, four 16-bit words are moved from the memory address in R0 into the data registers D0, D1, D2, and D3, respectively. The data must align on an 8-byte boundary, so the address contained in R0 should be a multiple of eight. The examples in Figure 8 further illustrate this point. Bringing one word from memory (memory from 0x00 holds AA BB CC DD): Aligned: move.w (r0),d0 where r0 = 0x0 or 0x2 is aligned on a 2-byte boundary; the correct operation brings either AABB if R0 = 0x0 or CCDD if R0 = 0x2. Not aligned: move.w (r0),d0 where r0 = 0x1 or 0x3 is not aligned on a 2-byte boundary; the erroneous operation brings wrong data into d0. Bringing two words from memory (memory from 0x00 holds AA BB CC DD EE): Aligned: move.2w (r0),d0:d1 where r0 = 0x0 or 0x4 is aligned on a 4-byte boundary; the correct operation brings AABB CCDD if R0 = 0x0. Not aligned: move.2w (r0),d0:d1 where r0 = 0x1, 0x2, or 0x3 is not aligned on a 4-byte boundary; the erroneous operation brings wrong data into d0:d1. Figure 8. Alignment Considerations. The following instructions require data to be aligned on the specified boundaries: move.w (r0),d0: 2-byte boundary; move.f (r0),d0: 2-byte boundary; move
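The alignment rule described above (a 2-, 4-, or 8-byte access needs an address that is a multiple of its width) can be sketched as a small check in C. This is only an illustration: the helper name is_aligned and the width constants are hypothetical and are not part of the SC140 tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Access widths in bytes for the one-, two-, and four-word moves
   discussed in the text (names are illustrative only). */
enum { MOVE_W_BYTES = 2, MOVE_2W_BYTES = 4, MOVE_4W_BYTES = 8 };

/* Returns true when addr is a multiple of width.
   width must be a non-zero power of two. */
static bool is_aligned(uint32_t addr, uint32_t width)
{
    assert(width != 0 && (width & (width - 1)) == 0);
    return (addr & (width - 1)) == 0;
}
```

With this sketch, address 0x2 passes the check for a one-word move but fails it for a two-word move, which matches the aligned and not-aligned cases listed for Figure 8.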
9. [Table: Expected and Simulator register values for move.2w, move.2f, and move.4w (r0),d4:d5:d6:d7.] Second Code Section. 4. Compile the Ex4.c file: ccsc100 -be Ex4.c -o Ex4.eld. The Big Endian (-be) option is used in this exercise to make it easier to read the data in the simulator memory window. If desired, the Little Endian mode can also be used. 5. Run the GUI simulator: guisc100. In the simulator command window, type reset d m1 to put the simulator in Big Endian mode. Open an assembly window (Windows > Assembly). Load the file (Load Ex4.eld). Set a breakpoint on main by typing break _main into the command window. Type go. The code should now be at the start of main. 10. Open a memory window (Windows > Memory) and click OK. 11. Type data into the Scroll box of the memory window to display the contents of the array data defined in Ex4.c. Verify that these contents are as expected. 12. Type next to step through the code. 13. Look at the register contents in the session window and write the values in the Simulator Columns boxes above for both sections. Congratulations, you have completed Exercise 4. Good To Know: Unaligned data accesses lead to erroneous results. You must consider thes
10. value of the array1 and the conditional test. 11. Compile the code using the -Ot2 option. 12. Open the file Ex7_1.sl. 13. If you have obtained two cycles for the inner loop, congratulations. If you have not, please try again. In the following box, write the optimized C code. [Box: C Code / Generated Assembly Code] 8 Calling an Assembly Routine From C Exercise. Practical DSP applications commonly use a mixture of C and assembly language. This exercise shows how an assembly language function can be called from C code. The code for this exercise is contained in two files, Ex8.c and addvecs.asm. The C code in Ex8.c calls the assembly language function addvecs in file addvecs.asm to add two vectors together and return the sum of all the elements of the resultant vector. The prototype for addvecs is as follows: short addvecs(short x[], /* Input vector */ short y[], /* Input vector */ short z[], /* Output vector */ short length); /* Length of vectors */ Four parameters are passed to addvecs. The first three are pointers to arrays and are therefore 32-bit values (addresses are 32 bits in StarCore). The fourth parameter is the length of the vectors and is a 16-bit value. The mechanism by which parameters are passed is specified in the application binary interface (ABI). Generally speaking, this ABI specifies the following calling convention
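As a reference point for the behavior described above (add two vectors element by element and return the sum of the result), the following is a minimal C sketch. It is not the assembly routine from addvecs.asm; the name addvecs_ref and the argument checks are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C reference for what the text says addvecs does:
   z[i] = x[i] + y[i] for each element, and the return value is the
   sum of all z[i]. Overflow handling is not addressed here. */
static short addvecs_ref(const short x[], const short y[], short z[], short length)
{
    short sum = 0;
    short i;

    assert(x != NULL && y != NULL && z != NULL && length >= 0);
    for (i = 0; i < length; i++) {
        z[i] = (short)(x[i] + y[i]);   /* element-wise vector add */
        sum = (short)(sum + z[i]);     /* running sum of the output vector */
    }
    return sum;
}
```

A C reference like this can be useful for checking the output of the assembly version against known input vectors before looking at stack offsets.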
11. 0, 0x1A00, 0x1B00, 0x1C00, 0x1D00, 0x1E00, 0x1F00, 0x2000}; short y[32]; /* For reference, the following output should be observed after running the code: y[0] = 0x0020, y[1] = 0x0080, y[2] = 0x0140, y[3] = 0x0280, y[4] = 0x0460, y[5] = 0x0700, y[6] = 0x0A80, y[7] = 0x0D00, y[8] = 0x0EA0, y[9] = 0x0F80, y[10] = 0x0FC0, y[11] = 0x0F80, y[12] = 0x0F40, y[13] = 0x0F00, y[14] = 0x0EC0, y[15] = 0x0E80, y[16] = 0x0E40, y[17] = 0x0E00, y[18] = 0x0DC0, y[19] = 0x0D80, y[20] = 0x0D40, y[21] = 0x0D00, y[22] = 0x0CC0, y[23] = 0x0C80, y[24] = 0x0C40, y[25] = 0x0C00, y[26] = 0x0BC0, y[27] = 0x0B80, y[28] = 0x0B40, y[29] = 0x0B00, y[30] = 0x0AC0, y[31] = 0x0A80 */ main() { long res0, res1, res2, res3; short var0, var1, var2, var3; short n, i, *x_ptr; x_ptr = &input[14]; /* x_ptr points to input[14], which is x(3) */ for (n = 0; n < 32; n += 4) { res0 res1 res2 res3 var3 var2 var1 va
12. 00 0000 0000 | 0xF000 | -4096 | -0.125. 2.1 Hardware Support on StarCore. StarCore has a dual instruction set for operations that produce different results depending on whether fractional or integer arithmetic is used. The instruction set is complementary when an integer or a fractional operation leads to the same result regardless of the operation type, for example, an addition. The instruction set is dual, as shown in Table 2, in two cases, which automatically take care of data alignment, zero filling, and sign extension: when an integer or a fractional operation leads to a different result depending on the operation type, for example, a multiplication; and when data is transferred from/to memory. Table 2. Fractional and Integer Assembly Language Instructions (Operation: Integer / Fractional). Multiply: impy / mpy. Multiply-accumulate: imac / mac. Move: move.b, move.w, move.2w, move.4w / move.f, move.2f, move.4f. 2.2 Compiler Support on StarCore. The StarCore compiler implements fractional arithmetic using built-in intrinsic functions based on integer data types. Any fractional values or constants must therefore be defined using their integer equivalent. Useful relationships for deriving these integer representations from the fractional values are as follows: 16-bit integer value = fractional value × 2^15; 32-bit integer value
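The 16-bit relationship can be sketched in C, assuming the usual scaling in which a 16-bit fractional value is stored as the fraction multiplied by 2^15. The function names frac_to_int16 and int16_to_frac are hypothetical helpers, not compiler intrinsics, and the conversion simply truncates toward zero.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a fractional value in [-1.0, 1.0) to its 16-bit integer
   equivalent by scaling with 2^15 (assumed scaling, truncating). */
static int16_t frac_to_int16(double frac)
{
    assert(frac >= -1.0 && frac < 1.0);
    return (int16_t)(frac * 32768.0);
}

/* Inverse mapping: 16-bit integer representation back to a fraction. */
static double int16_to_frac(int16_t value)
{
    return (double)value / 32768.0;
}
```

Under this assumption, 0.125 maps to 4096 (0x1000) and -0.125 maps to -4096, whose 16-bit pattern is 0xF000.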
13. 1999 MOTOROLA INC.
 * INTRODUCTION TO THE SC140 TOOLS
 * Developed by MOTOROLA SPS NCSG NISD
 */
/* Multi-sample technique: Exercise on an FIR Filter */
#include <stdio.h>
#include <prototype.h>

short a[12] = {0x1000, 0x2000, 0x3000, 0x4000, 0x5000, 0x6000,
               0x7000, 0x8000, 0x9000, 0xA000, 0xB000, 0xC000};

short input[32 + 11] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,   /* zero padding */
    0x0100, 0x0200, 0x0300, 0x0400, 0x0500, 0x0600, 0x0700, 0x0800,
    0x0900, 0x0A00, 0x0B00, 0x0C00, 0x0D00, 0x0E00, 0x0F00, 0x1000,
    0x1100, 0x1200, 0x1300, 0x1400, 0x1500, 0x1600, 0x1700, 0x1800,
    0x1900, 0x1A00, 0x1B00, 0x1C00, 0x1D00, 0x1E00, 0x1F00, 0x2000};

short y[32];

/* For reference, the following output should be observed after running the code:
 * y[0] = 0x0020, y[1] = 0x0080, y[2] = 0x0140, y[3] = 0x0280, y[4] = 0x0460,
 * y[5] = 0x0700, y[6] = 0x0A80, y[7] = 0x0D00, y[8] = 0x0EA0, y[9] =
14. 2 r for next iteration results y n < 32 n y d 0x 04hX n n y n. Exercise 7: /* MOTOROLA INC. * SEMICONDUCTOR PRODUCTS SECTOR * COPYRIGHT 1999 MOTOROLA INC. * INTRODUCTION TO THE SC140 TOOLS * Developed by MOTOROLA SPS NCSG WISD */ short array1[10] 1 1 1 2 2 2 2 2 3 3; short array2[10]; main() { short i; short *array2_ptr; short tmp; array2_ptr = &array2[0]; for (i = 0; i < 10; i++) tmp array1[i] if tmp < 0 array2_ptr tmp tmp tmp. Exercise 8: Z OFFSET equ 12, M OFFSET equ 14. Exercise 9: /* MOTOROLA INC. * SEMICONDUCTOR PRODUCTS SECTOR * COPYRIGHT 1999 MOTOROLA INC.
15. 2w (r0),d0:d1: 4-byte boundary; move.2f (r0),d0:d1: 4-byte boundary; move.4w (r0),d0:d1:d2:d3: 8-byte boundary; move.4f (r0),d0:d1:d2:d3: 8-byte boundary; move.l (r0),d0: 4-byte boundary; move.2l (r0),d0:d1: 8-byte boundary. Hands On: 1. Open the Ex4.c file, which contains a series of assembly instructions within a C framework using asm statements. For alternative and nicer ways of incorporating assembly code, consult the SC100 C/C++ Compiler User's Manual. 2. Look at the assembly instructions to understand the wide data move instructions. Notice that the code comprises two sections: the first section with aligned data and the second with non-aligned data. 3. For each instruction, write the result you expect from each section in the boxes provided here in the Expected Columns. Array data is of type long int and therefore aligns on a 4-byte boundary. [Table: First Code Section. Columns: Expected, Simulator. Rows: move.w, move.2w, move.2f, move.4w, move.2l (r0),d8:d9. Array data: 0x01 0x23 0x45 0x67 0x89 0xAB 0xCD 0xEF 0xAA 0xBB 0xEE 0xFF 0x11 0x22.] [Table: Second Code Section. Columns: Expected, Simulator. Array data: 0x01 0x23 0x45 0x67 0x89 0xCD 0xAA 0xDD 0xEE 0xFF 0x11 0x22. First row: move data 2 r0]
16. = fractional value × 2^31; 40-bit integer value = fractional value × 2^31. The names of the built-in intrinsics conform to the ITU/ETSI basic operation functions. For instance, the L_mac intrinsic function is used in the following example (see Figure 3), and a complete list of the intrinsic functions for fractional arithmetic can be found in the SC100 C/C++ Compiler User's Manual. The example illustrates how the instructions are mapped based on the type of the arithmetic required. For integer arithmetic, the compiler generates integer instructions, for example, imac. For fractional arithmetic, it generates fractional instructions, for example, mac. Also, move instructions are generated with correct data alignment. Integer: long a; short b, c; a = a + b * c; generates move.w (r0),d0 and imac d0,d1,d2. Fractional (supported by intrinsics): long a; short b, c; a = L_mac(a, b, c); generates move.f (r0),d0 and mac d0,d1,d2. Figure 3. Integer and Fractional Compiler Support. Hands On: The energy of a signal x, represented by Equation 1, is considered: $y = \sum_{i=0}^{N-1} x(i)^2 \quad (1)$ where x(i) is the signal input sample at iteration i, y is the energy of the signal, and N is the signal length. 1. Open the example file Ex2.c. Integer Arithmetic: 2. Compile the file using ccsc100 -Ot2 Ex2.c -o Ex2.eld, where the -Ot2 option optimizes the code for time (force parallelization). 3. Run the executable using runsc100. 4. Recompile the file with the -S option, which stops th
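To make the difference between the integer and fractional energy calculations concrete on a host machine, the sketch below uses a portable stand-in for L_mac. It assumes the ITU-T/ETSI basic-operation semantics (a doubled product saturated to 32 bits, followed by a saturating add); l_mac_ref, energy_integer, and energy_fractional are hypothetical names, and this is not the compiler intrinsic itself.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for L_mac under assumed ITU-T/ETSI basic-op
   semantics: acc + saturate32(2 * a * b), with a saturating add. */
static int32_t l_mac_ref(int32_t acc, int16_t a, int16_t b)
{
    int64_t product = 2 * (int64_t)a * (int64_t)b;  /* fractional multiply doubles the product */
    int64_t sum;

    if (product > INT32_MAX) product = INT32_MAX;   /* only when a == b == -32768 */
    sum = (int64_t)acc + product;
    if (sum > INT32_MAX) sum = INT32_MAX;
    if (sum < INT32_MIN) sum = INT32_MIN;
    return (int32_t)sum;
}

/* Integer energy: plain sum of squares. */
static int32_t energy_integer(const int16_t *x, int n)
{
    int32_t res = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        res += (int32_t)x[i] * x[i];
    return res;
}

/* Fractional energy: the same loop expressed with the L_mac stand-in. */
static int32_t energy_fractional(const int16_t *x, int n)
{
    int32_t fres = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        fres = l_mac_ref(fres, x[i], x[i]);
    return fres;
}
```

For small inputs with no saturation, the fractional result is exactly twice the integer result, which reflects the extra left shift of a fractional multiply.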
17. /* MOTOROLA INC.
 * SEMICONDUCTOR PRODUCTS SECTOR
 * COPYRIGHT 1999 MOTOROLA INC.
 * INTRODUCTION TO THE SC140 TOOLS
 * Developed by MOTOROLA SPS NCSG WISD
 */
/* Split Summation Technique Exercise */
#include <stdio.h>
#include <prototype.h>

short x[12] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

main()
{
    short i;
    long res1 = 0, res2 = 0, res3 = 0, res4 = 0;

    for (i = 0; i < 12; i += 4)
    {
        res1 = L_mac(res1, x[i], x[i]);
        res2 = L_mac(res2, x[i+1], x[i+1]);
        res3 = L_mac(res3, x[i+2], x[i+2]);
        res4 = L_mac(res4, x[i+3], x[i+3]);
    }

    /* To optimise the code further, break the following dependency:
     *     res1 = res1 + res2 + res3 + res4
     * into:
     */
    res1 = res1 + res2;
    res3 = res3 + res4;
    res1 = res1 + res3;

    printf("Result %d 0x%x\n", res1, res1);
}

Exercise 6: Intermediate version (Compromise between Memory and Speed)

/* MOTOROLA INC.
 * SEMICONDUCTOR PRODUCTS SECTOR
 * COPYRIGHT
18. MOTOROLA Semiconductor Products Sector. Application Note, 9/2001. Introduction to the StarCore SC140 Tools: An Approach in Nine Exercises. Emmanuel Roy and David Crawford. This document presents a quick, comprehensive, hands-on introduction to the StarCore SC140 DSP core using programming examples and exercises. The goal is to help the software developer start writing high-level language applications in C. Included are software-related tips on how to get the most from the StarCore hardware architecture. We recommend that you complete the exercises in sequential order. The exercises require the use of the SC140 C tools, including compiler, assembler, linker, and simulator, to generate executable files from C and assembly language source files and to verify the code performance. Contents: Preliminaries; File I/O Exercise; Integer and Fractional Arithmetic Exercise (Hardware Support on StarCore; Compiler Support on StarCore); Local Versus Global Optimization Exercise; Memory Alignment Exercise; Split Summation Exercise; Multi-Sample Exercise; Control Code: The True Bit Exercise; Calling an Assembly Routine From C Exercise; The Challenge; Solutions to Exercises. The tools are invoked from a
19. and this must be weighed against the cycle count performance improvements obtained. Table 3 summarizes the main characteristics of the multi-sample technique. Table 3. Inner Loop Characteristics of Multi-sample and Single-sample Techniques (Characteristic: Single-sample Algorithm / Multi-sample Algorithm). Cycle count: N / N/4. Registers used: Fewer / More. Sample delay: 1 / 4. Number of memory moves (bandwidth): 2N / N/2. Code size: Small / Large. Control Code: The True Bit Exercise. The True bit exercise shows how the compiler uses the True bit and how you can help the compiler to improve the performance. The True bit is set/cleared by compare or test instructions. The use of the True bit as a control flag together with DSP-specific code makes the SC140 very powerful for applications including both control and DSP code. The True bit can affect conditional branching as well as conditional execution of groups of instructions. Conditional branching includes: BT/BF: Branch relative if True bit is True/False. BTD/BFD: Branch delayed relative if True bit is True/False. JT/JF: Jump if True bit is True/False. JTD/JFD: Jump delayed if True bit is True/False.
20. ar product. The objective of this session is to optimize the code from Ex9.c for speed and obtain the minimum number of cycles. Hands On: 1. Put into practice the techniques previously explained to optimize Ex9.c. The original number of cycles in the inner loop is: [Box: Original Inner loop]. So far, after having modified the code, your best result is: [Box: Your Best Inner loop]. The optimized C code result is: Target Inner loop: 1 cycle, with ALUs and AAUs 100 percent used. If your best result is within 10 percent of the target result, congratulations. You have completed all the exercises and the challenge as well. 10 Solutions to Exercises. Exercise 1: /* MOTOROLA INC. * SEMICONDUCTOR PRODUCTS SECTOR * COPYRIGHT 1999 MOTOROLA INC. * INTRODUCTION TO THE SC140 TOOLS * Developed by MOTOROLA SPS NCSG WISD */ #include <stdio.h> main() { printf("Wel
21. ata is aligned on the appropriate boundary. Otherwise, the wrong data is transferred. [Figure 7. Memory Granularity: 0x00: AA BB CC DD EE FF AB BC (8 bytes); 0x08: 01 23 45 67 89 AB CD EF (8 bytes); 0x10: ...] The following instructions bring more than one byte at a time to the data register: move.w (Rx),Dn: transfer one 16-bit word from memory (2 bytes). move.f (Rx),Dn: transfer one 16-bit word from memory (2 bytes). move.2w (Rx),Dh: transfer two 16-bit words from memory (4 bytes). move.2f (Rx),Dh: transfer two 16-bit words from memory (4 bytes). move.4w (Rx),Dk: transfer four 16-bit words from memory (8 bytes). move.4f (Rx),Dk: transfer four 16-bit words from memory (8 bytes). move.2l (Rx),Dh: transfer two 32-bit words from memory (8 bytes). Here x spans from 0 to 15, and the data register notations are as follows: Dn represents D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, D10, D11, D12, D13, D14, or D15. Dh represents D0:D1, D2:D3, D4:D5, D6:D7, D8:D9, D10:D11, D12:D13, or D14:D15. Dk represents D0:D1:D2:D3, D4:D5:D6:D7, D8:D9:D10:D11, or D12:D13:D14:D15. Most processors require operands to be aligned in memory and multiple-operand loads/stores to be aligned. For example, a double-operand load requires an even address, and a quad-operand load requires a double-even address. These restrictions reduce the complexity of the
22. Equation 5:

$$
\begin{aligned}
y(n)   &= a_0 x(n)   + a_1 x(n-1) + a_2 x(n-2) + a_3 x(n-3) + \cdots + a_{N-2}\, x(n-N+2) + a_{N-1}\, x(n-N+1) \\
y(n+1) &= a_0 x(n+1) + a_1 x(n)   + a_2 x(n-1) + a_3 x(n-2) + \cdots + a_{N-2}\, x(n-N+3) + a_{N-1}\, x(n-N+2) \\
y(n+2) &= a_0 x(n+2) + a_1 x(n+1) + a_2 x(n)   + a_3 x(n-1) + \cdots + a_{N-2}\, x(n-N+4) + a_{N-1}\, x(n-N+3) \\
y(n+3) &= a_0 x(n+3) + a_1 x(n+2) + a_2 x(n+1) + a_3 x(n)   + \cdots + a_{N-2}\, x(n-N+5) + a_{N-1}\, x(n-N+4)
\end{aligned}
\qquad (5)
$$

The columns of Equation 5 form, from left to right, Group 0, Group 1, Group 2, Group 3, ..., Group N-2, and Group N-1.

In Equation 5, the products and accumulations within each group are calculated in parallel, but the groups themselves are evaluated in sequence, thus preserving the order of accumulation, which in turn preserves the bit-exactness of Equation 4. Therefore, parallelization is achieved by processing multiple samples in parallel rather than multiple intermediate products belonging to only one output sample. When one group (for example, Group 2) is evaluated, only two words of data need to be loaded for the next group (Group 3): $a_3$ and $x(n-3)$. The other values needed for the calculations in Group 3, $x(n-2)$, $x(n-1)$, and $x(n)$, should already exist in the DSP registers from the calculation of Group 2. The result is a reduction in memory bandwidth requirements that increases code efficiency.

Hands On: 1. Open the Ex6.c file. 2. Compile Ex6.c using the -Ot2 option. Run the code and verify that the output is correct. See the comments in Ex6.c for the correct values of y. 3. Recompile Ex6.c using the -Ot2 and -S options.
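The multi-sample idea can be sketched in plain integer C as follows. This is an illustration under stated assumptions, not the code of Ex6.c: it uses ordinary integer multiply-accumulates instead of L_mac, a hypothetical 4-tap filter over 8 samples, and the names fir_single, fir_multi, TAPS, SAMPLES, and PAD. The input pointer must have PAD zero samples in front of it, because the multi-sample loop reads one sample further back than it uses.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes; PAD zero samples must precede the first input. */
enum { TAPS = 4, SAMPLES = 8, PAD = TAPS };

/* Single-sample reference: y[n] = sum over i of a[i] * x[n - i]. */
static void fir_single(const int16_t *a, const int16_t *in, int32_t *y)
{
    int n, i;

    for (n = 0; n < SAMPLES; n++) {
        int32_t acc = 0;
        for (i = 0; i < TAPS; i++)
            acc += (int32_t)a[i] * in[n - i];
        y[n] = acc;
    }
}

/* Multi-sample version: four outputs per outer iteration. Each inner
   iteration loads one coefficient and one new sample, and reuses the
   three samples already held in local variables. */
static void fir_multi(const int16_t *a, const int16_t *in, int32_t *y)
{
    int n, i;

    assert(SAMPLES % 4 == 0);
    for (n = 0; n < SAMPLES; n += 4) {
        int32_t res0 = 0, res1 = 0, res2 = 0, res3 = 0;
        int16_t var0 = in[n], var1 = in[n + 1];
        int16_t var2 = in[n + 2], var3 = in[n + 3];

        for (i = 0; i < TAPS; i++) {
            int16_t c = a[i];              /* coefficient load */
            res0 += (int32_t)c * var0;     /* Group i, output y[n]   */
            res1 += (int32_t)c * var1;     /* Group i, output y[n+1] */
            res2 += (int32_t)c * var2;     /* Group i, output y[n+2] */
            res3 += (int32_t)c * var3;     /* Group i, output y[n+3] */
            var3 = var2;                   /* shift samples for the next group */
            var2 = var1;
            var1 = var0;
            var0 = in[n - i - 1];          /* the only new data load */
        }
        y[n] = res0;
        y[n + 1] = res1;
        y[n + 2] = res2;
        y[n + 3] = res3;
    }
}
```

Both functions accumulate each output in the same order, so their results should match exactly; the difference is only in how many memory reads each multiply-accumulate needs.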
23. come to StarCore SC140 Tools\n");
}

Exercise 2

/* MOTOROLA INC.
 * SEMICONDUCTOR PRODUCTS SECTOR
 * COPYRIGHT 1999 MOTOROLA INC.
 * INTRODUCTION TO THE SC140 TOOLS
 * Developed by MOTOROLA SPS NCSG NISD
 */
#include <stdio.h>
#include <prototype.h>

short x[12] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

main()
{
    short i;
    long res = 0;
    long fres = 0;

    for (i = 0; i < 12; i++)
        res += x[i] * x[i];

    for (i = 0; i < 12; i++)
        fres = L_mac(fres, x[i], x[i]);

    printf("The integer result is %d 0x%x\n", res, res);
    printf("The fractional result is %d 0x%x\n", fres, fres);
}

Exercise 3

No code modification is required.

Exercise 4

move data r0
move.w (r0),d0
move.2w (r0),d0:d1
move.2f (r0),d2:d3
move.4w (r0),d4:d5:d6:d7
move.2l (r0),d8:d9

data (r0): 0x01 0x23 0x45 0x67 0x89 0xAB 0xCD 0xEF 0xAA 0xBB 0xDD 0xEE 0xFF 0x11 0x22
24. e compiler after compilation. 5. Open the generated assembly file Ex2.sl and look at the integer instructions within the loop. 6. In the box provided here, write down the integer C code and the generated assembly instructions for the loop. Notice that the first data load is automatically pipelined in the software. [Box: Integer Arithmetic: C code / Generated Assembly code] Fractional Arithmetic: 7. For fractional arithmetic, copy and paste the loop of Ex2.c. The first loop remains unchanged and performs integer calculation, while the second loop is modified to perform fractional arithmetic. 8. In the second loop, replace the integer arithmetic operation with the appropriate fractional intrinsic. Remember, fractional arithmetic is performed using C compiler intrinsics. In this example, the L_mac intrinsic is used. Its prototype is: long int L_mac(long int, short int, short int). Therefore, the code modifications should be: a. Create a new variable fres of type long int. b. Replace res += x[i] * x[i] with the instruction fres = L_mac(fres, x[i], x[i]). c. Include the file prototype.h, which contains all the intrinsics prototypes. d. Add another printf statement to print out the fractional result. The result is still a long int, so %d should still be used. 9. Recompile the code with the -S option and look at the generated assembl
25. e issues when developing assembly code. 6 Split Summation Exercise. The split summation exercise shows how to modify C code using the split summation technique to get better parallelization. The split summation technique helps to maximize the multiple ALU loading by performing arithmetic operations in parallel while requiring little algorithmic or code modification. To illustrate this technique, the example optimizes the calculation of the energy of a signal already considered in Exercise 2. The power calculation is represented in Equation 2: $y = \sum_{i=0}^{N-1} x(i)^2 \quad (2)$ where x(i) is the signal input sample at iteration i, y is the power of the signal, and N is the signal length. As Exercise 2 shows, computing the signal energy directly from Equation 2 results in the use of only one ALU out of the four, with one multiply-accumulate operation performed at each iteration. However, the split summation technique can load all four ALUs. Equation 2 is expanded as follows: $y = \sum_{i=0,4,8,\ldots}^{N-1} \left[ x(i)x(i) + x(i+1)x(i+1) + x(i+2)x(i+2) + x(i+3)x(i+3) \right] \quad (3)$ Equation 3 explicitly highlights the four multiply-accumulate operations that can be performed in parallel. Figure 9 illustrates this, where each parallel execution is represented by Group 0, Group 1, and so on. It also shows that the sample number i from one group to the other is incremented by four. Figure 9. Signal Power Calculation Using the Split Summation Technique.
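A minimal C sketch of the split summation idea follows, using plain integer arithmetic rather than the L_mac intrinsic so that it runs on any host compiler. The function names energy_direct and energy_split4 are hypothetical, and the sketch assumes the signal length is a multiple of four.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: one accumulator, so each multiply-accumulate depends
   on the previous one. */
static int32_t energy_direct(const int16_t *x, int n)
{
    int32_t res = 0;
    int i;

    for (i = 0; i < n; i++)
        res += (int32_t)x[i] * x[i];
    return res;
}

/* Split summation: four independent accumulators, stepping i by four,
   so the four multiply-accumulates in one iteration do not depend on
   each other. n is assumed to be a multiple of four. */
static int32_t energy_split4(const int16_t *x, int n)
{
    int32_t res1 = 0, res2 = 0, res3 = 0, res4 = 0;
    int i;

    assert(n >= 0 && n % 4 == 0);
    for (i = 0; i < n; i += 4) {
        res1 += (int32_t)x[i] * x[i];
        res2 += (int32_t)x[i + 1] * x[i + 1];
        res3 += (int32_t)x[i + 2] * x[i + 2];
        res4 += (int32_t)x[i + 3] * x[i + 3];
    }
    /* Combine pairwise so the final additions are also less serialized. */
    return (res1 + res2) + (res3 + res4);
}
```

Because integer addition without overflow is associative, both functions return the same value; with saturating fractional arithmetic the order of accumulation can matter, which is a separate consideration.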
26. hanges without further notice to any products herein. Motorola makes no warranty, representation, or guarantee regarding the suitability of its products for any particular purpose, nor does Motorola assume any liability arising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation consequential or incidental damages. "Typical" parameters which may be provided in Motorola data sheets and/or specifications can and do vary in different applications, and actual performance may vary over time. All operating parameters, including "Typicals," must be validated for each customer application by customer's technical experts. Motorola does not convey any license under its patent rights nor the rights of others. Motorola products are not designed, intended, or authorized for use as components in systems intended for surgical implant into the body, or other applications intended to support life, or for any other application in which the failure of the Motorola product could create a situation where personal injury or death may occur. Should Buyer purchase or use Motorola products for any such unintended or unauthorized application, Buyer shall indemnify and hold Motorola and its officers, employees, subsidiaries, affiliates, and distributors harmless against all claims, costs, damages, and expenses, and reasonable attorney fees arising out of, directly or indirectly, any claim of personal injury
27. hem. This transfer results in two clock cycles for every four MAC instructions. In the box on the following page, write the code for the intermediate version.

M MOTOROLA Introduction to the SC140 Tools 19
Compiler Support on StarCore

[Box: C Code | Generated Assembly Code]

Further Speed Optimization

The register-to-register transfers can be eliminated by expanding the inner loop so that each group of four MAC instructions uses the data registers that already contain the required data values. This yields faster code, but the code size is greater.

9. Save Ex6 1.c as Ex6 2.c.
10. In Ex6 2.c, unroll the inner loop instructions four times so that the first four groups (Group 0, Group 1, Group 2, and Group 3) are all processed in the loop. This loop expansion avoids transferring data. You must reduce the number of loop iterations by a factor of four to compensate for the fact that the loop is unrolled by a factor of four.

If your inner loop consumes just four cycles and your code still produces the correct output, congratulations! You have completed Exercise 6.

Notice that each group of four MAC operations and two data load operations now requires just one processor cycle, which is half the time required by the filtering operation and a quarter of the time required by a single-ALU DSP device. However, the code size for the inner loop has increased significantly, to approximately four times that of the second implementation.
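The multisample idea behind this exercise can be sketched in portable C as follows. This is a minimal sketch under stated assumptions, not the Ex6 solution: fir_direct and fir_multisample are hypothetical names, the input buffer is assumed to carry taps - 1 history samples in front of the data, the output length is assumed to be a multiple of four, and plain integer arithmetic replaces the fractional intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x_padded holds (taps - 1) history samples followed by len input samples,
 * so the sample x(n - i) is found at x_padded[n + taps - 1 - i]. */

/* Reference FIR: one output sample at a time, one product per step. */
void fir_direct(const int16_t *a, size_t taps,
                const int16_t *x_padded, int64_t *y, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        int64_t acc = 0;
        for (size_t i = 0; i < taps; i++)
            acc += (int32_t)a[i] * x_padded[n + taps - 1 - i];
        y[n] = acc;
    }
}

/* Multisample FIR: four output samples are computed together. Each inner
 * step applies the same coefficient a[i] to four neighbouring inputs, so
 * every output still accumulates its products in the order i = 0, 1, 2, ...
 * Assumption for this sketch: len is a multiple of 4. */
void fir_multisample(const int16_t *a, size_t taps,
                     const int16_t *x_padded, int64_t *y, size_t len)
{
    assert(len % 4 == 0);
    for (size_t n = 0; n < len; n += 4) {
        int64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
        for (size_t i = 0; i < taps; i++) {
            const int16_t *xp = &x_padded[n + taps - 1 - i];
            int32_t coef = a[i];
            acc0 += coef * xp[0];   /* contributes to y(n)     */
            acc1 += coef * xp[1];   /* contributes to y(n + 1) */
            acc2 += coef * xp[2];   /* contributes to y(n + 2) */
            acc3 += coef * xp[3];   /* contributes to y(n + 3) */
        }
        y[n]     = acc0;
        y[n + 1] = acc1;
        y[n + 2] = acc2;
        y[n + 3] = acc3;
    }
}
```

Since the order of accumulation within each output sample is unchanged, the two functions produce identical results, which is the property that distinguishes this approach from split summation.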
28. [Figure 12. Stack Contents on Entry to addvecs. Labels: SP on function entry; status register and return address (pushed on stack by the jsr/bsr instruction); parameters pushed onto stack prior to jsr/bsr; SP prior to function call.]

4. In the box provided here, write what you think the offsets should be.

   Z_OFFSET:          M_OFFSET:

5. Modify the addvecs.asm file to incorporate your offset values.
6. Build the code.
7. Run the code: runsc100 Ex8.eld
8. The following output should be displayed:

   Z = {3, 5, 7, 9, 11, 13, 15, 17} sum = 80

9. If the above output is displayed, your offset values are correct.
10. Rebuild the code, this time with the -S option.
11. Open the generated assembly file Ex8.sl and find the call to addvecs.
12. Find the instructions that put z and M onto the stack just prior to the function call. Write the offsets in the box provided here.

   Z_OFFSET:          M_OFFSET:

13. Are the offsets used in Ex8 the same as the offsets used in addvecs.asm? If not, can you explain why?

Congratulations, you have completed Exercise 8.

Good To Know
• The stack pointer must always be a multiple of 8. It is illegal to increment it by a non-multiple of 8.

9 The Challenge

This section presents you with a challenge involving an example that implements a complex scal
29. imization. The compiler avoids calling the function by inlining it into the main code, as shown in Ex3 main.sl. It therefore eliminates the cycle overhead associated with jumping to and returning from the function and with passing the parameters to the function.

Congratulations, you have completed Exercise 3.

Good To Know
• Global optimization requires a longer compilation time than local optimization.
• Global optimization further optimizes the application speed.

4 Memory Alignment Exercise

The memory alignment exercise shows the usage of wide data moves and the alignments necessary for performing these moves. The SC140 memory has byte granularity, as represented in Figure 7. Two address arithmetic units (AAUs) transfer the data from memory to the four ALUs and vice versa via two 64-bit data buses. Each data bus allows the transfer of up to eight bytes between memory and the data registers in one cycle. For the compiler to generate the wide data move instructions available in the StarCore instruction set, such as move.2w, move.2f, move.4w, and so on, data must be correctly aligned in memory. This is due to the way the address and data buses operate for multi-byte accesses in the StarCore architecture. The compiler does not generate wide data move instructions if alignment is not guaranteed. However, if a function is implemented in assembly language and uses wide data move instructions, you must ensure that the d
30. ion compilation flow is represented in Figure 5.

[Figure 4. StarCore Local Optimization. Labels, repeated for each of three groups of C files (.c, .h): StarCore C compiler front end; IR files (.obj); local optimization, one optimizer per group (icode); assembler (asmsc100); object library files (.elb).]

[Figure 5. StarCore Global Optimization. Labels: three groups of C files (.c, .h), each with its own StarCore C compiler front end and IR files (.obj); global optimization, a single optimizer (icode); assembler (asmsc100); object library files (.elb).]

Hands On

The benefit of global optimization is most apparent when several files containing cross-references are used, as is often the case in any sizeable application. In this example, two files are used:
• the main file, called Ex3 main.c
• a function file, called Ex3 prod.c

The main file Ex3 main.c calls a routine defined in the function file Ex3 prod.c, as shown in Figure 6.

Ex3 prod c long Prod short 111 short a211 res Prod
31. kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk INTRODUCTION TO THE SC140 TOOLS Developed by MOTOROLA SPS NCSG WISD xkkkkkkkkkkkkkkkkkkkkkkkkkkkkikkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkikikikk include lt prototype h gt define DATA LENGTH 6 Word16 y 2 Word16 a1121 10x0200 0x0400 0x0200 0x0400 0x0200 0x0400 0x0200 0x0400 0x0200 0x0400 0x0200 0x0400 Wordi6 b 12 0x0100 0x0800 0x1000 0x2000 0x1000 0x0800 0x0200 0x0100 0x1000 0x0800 0x0200 0x0100 void main Yord16 i Word32 L Rel L Re2 L Iml L Im2 L Rel L Re2 L Imi L Im2 for i 0 i lt 2 DATA LENGTH i 2 L Rel L mac L Rel alil blil L Imi L mac L Iml a i b i 1 L Im2 L mac L Im2 a i 1 blil L Re2 L mac L Re2 11411 b i 1 y 0 round L Rel L Re2 y 1 round L Iml L Im2 M MOTOROLA Introduction to the SC140 Tools 37 Compiler Support on StarCore NOTES 38 ntroduction to the SC140 Tools M MOTOROLA Compiler Support on StarCore NOTES M MOTOROLA Introduction to the SC140 Tools 39 EOnCE is a registered trademark of Motorola Inc StarCore PowerQUICC II Motorola and the Motorola logo are trademarks of Motorola Inc The PowerPC name is a trademark of International Business Machines Corporation used by Motorola under license from International Business Machines Corporation Motorola reserves the right to make c
32. lay "Welcome to StarCore SC140 Tools".

The runsc100 executable is a cycle-accurate run-time simulator. It allows you to run an application to completion and print out intermediate/final results. You can use this executable for quick code verification and/or debugging purposes.

Congratulations, you have completed Exercise 1.

2 Integer and Fractional Arithmetic Exercise

One of the strengths of both the StarCore architecture and the StarCore compiler is the ability to perform both fractional and integer arithmetic. This exercise presents a reminder about integer and fractional arithmetic representation and then shows how to use the StarCore compiler fractional intrinsics.

Values stored in memory or registers are interpreted differently depending on the operation performed. For integers, the binary point is considered to be immediately to the right of the LSB. For the fractional case, the binary point is considered to be immediately to the right of the MSB. Table 1 illustrates this for 16-bit data values.

Table 1. Interpretation of 16-bit Integer and Fractional Data Values

Binary Representation    Hexadecimal Representation    Integer Value (decimal)    Fractional Value (decimal)
0100 0000 0000 0000      0x4000                         16384                      0.5
0001 0000 0000 0000      0x1000                         4096                       0.125
0000 0000 0000 0000      0x0000                         0                          0.0
1100 0000 0000 0000      0xC000                        -16384                     -0.5
1111 00
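The two interpretations of a 16-bit pattern can be checked with a short portable C sketch. The helper names as_integer and as_fraction are hypothetical and are not StarCore intrinsics; the sketch only assumes two's complement 16-bit values, with the fractional value obtained by dividing the integer value by 2^15.

```c
#include <stdint.h>

/* Integer interpretation: binary point immediately to the right of the LSB.
 * The bit pattern is read as a 16-bit two's complement number. */
int as_integer(uint16_t bits)
{
    return (bits & 0x8000u) ? (int)bits - 65536 : (int)bits;
}

/* Fractional interpretation: binary point immediately to the right of the
 * MSB (the sign bit), which is the integer value scaled by 1 / 2^15. */
double as_fraction(uint16_t bits)
{
    return as_integer(bits) / 32768.0;
}
```

For example, the pattern 0x4000 reads as 16384 in the integer case and as 0.5 in the fractional case, and 0xC000 reads as -16384 and -0.5.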
33. m files. Complex memory configurations can be specified and detailed linker maps can be generated.
• Version 6.3.77 StarCore 100 Simulator and Version 1.26 Run-time Simulator. The StarCore 100 simulator can run from either a text-based or a graphical user interface (GUI). A separate simulator utility, runsc100, is currently included for run-time I/O support.

Before starting the exercises, install the files in SC140 ex.zip on your computer in the following directory:
• On a Windows platform: C:\MotorolaDSP\SC140
• On a UNIX platform: MotorolaDSP/SC140

The exercises directory structure and files are represented in Figure 1. This directory structure is only a recommendation; any location can be used. Once you have installed the exercise files, and if you are running on a Windows platform, all the exercises are located in c:\MotorolaDSP\SC140\Exercises\. This path is the reference path for all exercises discussed in this document.

Exercises
  Ex1  File I/O (to be created)
  Ex2  Integer & Fractional Arithmetic
  Ex3  Local vs Global Optimization (Ex3 main.c & Ex3 prod.c)
  Ex4  Memory Alignment Considerations (Ex4.asm)
  Ex5  Split Summation Technique (Ex5.c)
  Ex6  Multi-Sample Technique (Ex6.c)
  Ex7  Control Code: Use of the True Bit (Ex7.c)
  Ex8  Calling an Assembly Routine from C (Ex8.c and AddVecs.asm)
  Ex9  The Challenge (Ex9.c)

Figure 1. Directory Structure and Files for SC140 Exercises

A typical development process i
34. Hands On

1. Open the Ex5.c file.
2. Build the code with Ot2, then run it and note the output result.
3. Split the current implementation of the loop, that is, res = L_mac(res, x[i], x[i]), into four independent equations, as represented in Figure 9. Independent means that the four equations are accumulated into different variables. Therefore, create four variables, one for each product. Tip: Watch your index increment.
4. Recompile the file and run it. The output result should be the same as before.
5. Recompile with the -S option and view the .sl file.
6. Your code is optimized when the loop takes only one cycle and computes four operations at a time. If the inner loop takes one cycle for four operations and the result is still correct, congratulations! You have completed Exercise 5.
7. In the box provided below, write the optimized inner loop code.

The split summation technique allows full use of all four ALUs, reducing the cycle time by more than 70 percent relative to the use of a single ALU. The 4-ALU technique does not guarantee bit exactness with the single-ALU technique because the order of accumulation is different. Using the 4-ALU technique therefore has implications in applications that are defined by bit-exact standards, such as speech coding standards from ITU, ETSI, TIA/EIA, and so on.
35. or death associated with such unintended or unauthorized use, even if such claim alleges that Motorola was negligent regarding the design or manufacture of the part. Motorola and M are registered trademarks of Motorola, Inc. Motorola, Inc. is an Equal Opportunity/Affirmative Action Employer.

MOTOROLA and the Stylized M Logo are registered in the US Patent & Trademark Office. OnCE, DigitalDNA, and the DigitalDNA logo are trademarks owned by Motorola, Inc. All other product or service names are the property of their respective owners. Motorola, Inc. 2001.

How to reach us:
USA/EUROPE/Locations Not Listed: Motorola Literature Distribution, P.O. Box 5405, Denver, Colorado 80217. 1-303-675-2140 or 1-800-441-2447
JAPAN: Motorola Japan Ltd., SPS, Technical Information Center, 3-20-1, Minami-Azabu, Minato-ku, Tokyo 106-8573 Japan. 81-3-3440-3569
ASIA/PACIFIC: Motorola Semiconductors H.K. Ltd., Silicon Harbour Centre, 2 Dai King Street, Tai Po Industrial Estate, Tai Po, N.T., Hong Kong. 852-26668334
Technical Information Center: 1-800-521-6274
HOME PAGE: http://www.motorola.com/semiconductors

MOTOROLA AN2009/D
36. r0 x ptr x ptr x ptr x ptr var3 var3 var3 var3 x n 3 x n 2 x n 1 xinl x ptr nov points to x n 1 for i 0 i 12 i 4 res0 resi res2 res3 var3 res0 resi res2 res3 var2 res0 resi res2 res3 var res0 resi res2 res3 var0 L mac res L mac res L mac res L mac res x ptr L mac res L mac res L mac res L mac res x ptr L mac res L mac res L mac res L mac res x ptr L mac res L mac res L mac res L mac res x ptr 0 ali 1 alil 2 alil 3 ali var3 a i 1 1 al i 1 2 a i 1 3 a i 1 var2 0 ali 2 1 a i 2 2 ali 21 3 a i 2 vari al i 3 1 a li 3 2 al i 3 3 a i 3 var0 varo i varl 2 var3 x n i 1 var3 Var0 varl vara x n i 2 vara var3 varo varl x n i 3 x vari i var2 var3 x n i 4 x Var0 Truncate results and store in y y n y n 1 y n 2 y n 3 x ptr 20 Increment pointer by 20 to point to x n 7 Print for n 0 printf extract h res0 extract h resl extract h extract h res3 res
37. Good To Know
• The use of four variables removes the accumulation dependency, which is required for parallelism.
• Bit-exact considerations must be understood if this technique is used; overflow/saturation characteristics may change during split summation.

6 Multi-Sample Exercise

The multi-sample exercise demonstrates the multisample technique. As the exercise in Section 5 shows, the split summation technique allows a sum-of-products operation to be calculated using all four ALUs by evaluating four intermediate products at a time. However, it does not guarantee bit-exact agreement with serially accumulating each intermediate product using a single ALU. To ensure bit exactness, the order of summation must be preserved by performing each intermediate product accumulation in turn, so the intermediate products of a single sum cannot be evaluated in parallel. Furthermore, the split summation technique may not be suited to the application. Other techniques can be used in which one intermediate product from each of four output sample calculations is evaluated in parallel. Consider the FIR filtering operation described by Equation 4:

$$y(n) = \sum_{i=0}^{N-1} a_i \, x(n-i), \qquad 0 \le n < L \qquad (4)$$

A C code implementation of this operation typically resembles the implementation of Ex6.c. To use all four ALUs, the operations can be grouped as illustrated in the following equation:

$$y(n) = a_0 x(n) + a_1 x(n-1) + a_2 x(n-2) + a_3 x(n-3)$$
38. s represented in Figure 2.

[Figure 2. StarCore Development Process. Labels: C files (.c); ccsc100 C compiler front end; IR (intermediate representation) files (.obj); optimizer (icode); assembly files; assembly files (.asm); assembler asmsc100; object files (.eln); IR library files (.lib); object library files (.elb); linker sc100 ld; map files (.map); listing files (.lst); absolute files (.eld); run-time simulator runsc100 (executes program to completion, C file I/O capability); interactive simulator simsc100 (DOS based).]

1 File I/O Exercise

The file I/O exercise shows how to use standard ANSI C I/O features within the current tools suite.

Hands On

1. Create a new text file called io.c.
2. Within the io.c file, write code using the ANSI C printf function to display "Welcome to StarCore SC140 Tools" on the screen (remember to include the header file stdio.h).
3. Compile the file using: ccsc100 io.c -o io.eld
   The -o option specifies the output file name (for example, io.eld). If the application does not compile successfully, correct the reported mistakes and recompile until compilation succeeds.
4. Run the executable (runsc100 io.eld) to disp
39. sults and store in y y n extract h res0 y n 1 extract h res1 y n 2 extract h res2 y n 3 extract h res3 X ptr 20 Increment pointer by 20 to point to x n 7 for next iteration Print results y for n 0 n 32 n printf y d 0x 04hxX n n yln Further Optimizing the Speed xxkkkkxkkkxkxkkkkkkkkkkkkkkkkikkkkkkkkkkikkkkkkkkkkkkikkkkkkkkkkkkkikkkkkkkkkkikkkkk k x MOTOROLA INC x SEMICONDUCTOR PRODUCTS SECTOR x COPYRIGHT 1999 MOTOROLA INC x INTRODUCTION TO THE SC140 TOOLS x Developed by MOTOROLA SPS NCSG NISD xkkkkkkkkkkkkkkkkkkkkkkkkkkkikkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkikkkkkkkkkkkk kk kk Multi sample technique Exercise on an FIR Filter include lt stdio h gt include lt prototype h gt short a1121 10x1000 0x2000 0x3000 0x4000 0x5000 0x6000 0x7000 0x8000 0x9000 0xA000 0xB000 0xC000 short input 32 11 0 0 0 0 0 0 0 0 0 0 0 zero padding 0x0100 0x0200 0x0300 0x0400 0x0500 0x0600 0x0700 0x0800 0x0900 0x0A00 0x0B00 0x0C00 0x0D00 0x0E00 0x0F00 0x1000 0x1100 0x1200 0x1300 0x1400 0x1500 0x1600 0x1700 0x1800 0x190
40. the calling function deallocates the stack space used to pass parameters 3, 4, 5, and so on.

[Figure 11. Typical Stack Contents During Function Execution. Stack snapshots, drawn from low address to high address, at five stages: 1. prior to function call; 2. on entry to function; 3. during function execution; 4. prior to exit from function; 5. on return from function (calling function deallocates parameters on stack). Labels: SP; current SP; parameters 3, 4, 5, and so on; return address; saved registers; local variables (if any).]

Therefore, for the function addvecs, parameters x and y are passed in r0 and r1, while z and M are passed on the stack.

Hands On

1. Open the Ex8.c and addvecs.asm files and familiarize yourself with the code.
2. In addvecs.asm there are two constants, Z_OFFSET and M_OFFSET, whose values are not set and which are represented by question marks. These offsets are used to pull z and M from the stack. Find the lines of code that perform this task.
3. Before the code can be built, you must assign values to Z_OFFSET and M_OFFSET. To help you do this, Figure 12 shows the stack on entry to addvecs.
41. y file Ex2.sl within the second loop.
10. In the box provided below, write the fractional C code and the generated assembly instructions for that loop.

[Box: Fractional Arithmetic C code | Generated Assembly code]

11. Compare the fractional assembly instructions generated with the integer assembly instructions.
12. Recompile the code without the -S option to produce an executable file.
13. Run the code using runsc100. The variables res and fres should print to the screen. What is the algebraic relationship between these two variables?

Congratulations, you have completed Exercise 2.

3 Local Versus Global Optimization Exercise

The local versus global optimization exercise shows the difference between two C compiler options: local optimization (the default) and global optimization. Local optimization compiles each file of the project individually, as represented in Figure 4. Global optimization acts as a global binder that links all the intermediate representation (IR) files into one file before optimizing the application. Since all the application code information is available, this approach enables further optimizations beyond those achieved using local optimization alone. Compilation takes longer when global optimization is enabled. Global optimizat
