
TMS320C31 Embedded Control Technical Brief


Contents

1. 1.3 Compatible Devices. The TMS320C31 is one of two members of the TMS320C3x generation of DSPs. The other member is the TMS320C30, which is object-code compatible with the C31. The C30 is identical to the C31 except that it has 4K words of ROM, two serial ports, and a second external bus. For more information on the TMS320C30, refer to the TMS320C3x User's Guide (literature number SPRU031). Figure 1-2 is a block diagram of the TMS320C3x devices. The shaded areas highlight the features that apply only to the C30. [Figure 1-2. TMS320C3x Block Diagram: RAM Block 0 (1K x 32), RAM Block 1 (1K x 32), ROM Block 0 (4K x 32), program cache (64 x 32), data buses, CPU with integer/floating-point multiplier and ALU, 8 extended-precision registers, address generators 0 and 1, 8 auxiliary registers, 12 control registers, DMA, controller, and peripheral bus. Shaded features are available on TMS320C30, TMS320C30-27, and TMS320C30-40.] In addition, the C30 and C31 are both source-code compatible with the TMS320C4x, which is the first DSP designed specifically for parallel processing. For more information on the C4x, refer to the TMS320C4x Technical Brief (literature number SPRU076).
2. The Texas Instruments low-cost, high-performance TMS320C31 has defined a new role for digital signal processors in embedded systems. Well suited for general-purpose use, the TMS320C31 is finding widespread acceptance as an embedded controller in applications such as industrial automation, telecommunications, motor control, automotive, instrumentation, laser printers, scanners, and voice mail, and is expanding the role of DSPs from math support to embedded control. The topics covered in this chapter include: 1.1 Embedded Controller Requirements; 1.2 TMS320C31 Key Features; 1.3 Compatible Devices; 1.4 TMS320C31 Development Support; 1.5 Benefits of a TMS320C31-Based Embedded System. 1.1 Embedded Controller Requirements. An embedded controller is a dedicated processor used in systems or subsystems, such as laser printers, voice mail systems, and bar code readers, to control a specific set of functions. Unlike a PC or workstation host CPU, an embedded controller is not accessible to the system user to run different software packages or to be reprogrammed, but is used to cost-effectively control a predetermined set of functions. To make these types of systems successful, embedded controllers must possess the following character
3. TMS320C31 Embedded Control Technical Brief. Texas Instruments, February 1998. IMPORTANT NOTICE: Texas Instruments Incorporated (TI) reserves the right to make changes to its products or to discontinue any semiconductor product or service without notice, and advises its customers to obtain the latest version of relevant information to verify, before placing orders, that the information being relied on is current. TI warrants performance of its semiconductor products and related software to current specifications in accordance with TI's standard warranty. Testing and other quality control techniques are utilized to the extent TI deems necessary to support this warranty. Specific testing of all parameters of each device is not necessarily performed, except those mandated by government requirements. Please be aware that TI products are not intended for use in life-support appliances, devices, or systems. Use of TI product in such applications requires the written approval of the appropriate TI officer. Certain applications using semiconductor devices may involve potential risks of personal injury, property damage, or loss of life. In order to minimize these risks, adequate design and operating safeguards should be provided by the customer to minimize inherent or procedural hazards. Inclusion of TI products in
4. [Figure: Doble test set-up, showing the test specimen, test voltage E, current, and power factor (cos θ); Cp = equivalent parallel capacitor, Rp = equivalent parallel resistor.] Doble Engineering upgraded their M-series test system from an all-analog design to an all-digital design to reduce production cost, provide portability, increase accuracy, and provide expert advice to the operator. Elegantly simple, the new system consists of an IBM-compatible PC/AT with an attached DSP. The DSP replaces the analog signal processing hardware and executes proprietary signal processing algorithms, which produce more accurate measurements. The PC host serves as an expert system and provides a graphical user interface (GUI), complete with dials and meters, for operator ease. [Figure 4-4. The New Doble M-Series System: operator, PC, DSP, realtime data and control.] 4.3.1 TMS320C30 and SPOX: Merging DSP and Control. By building an experimental DSP-based system using a fixed-point DSP, Doble began the transition from analog to digital technology. Because the DSP lacked many general-purpose functions and was difficult to program, they used it as a black box to replace the analog circuitry that performed filtering and modulation. The DSP
5. B.1 Part Numbers; B.2 Device and Development Support Tool Prefix Designators; B.3 Device Suffixes. Figures: TMS320C31 Performance; TMS320C3x Block Diagram; TMS320C3x Development Environment; Benefits of Replacing a Controller/Coprocessor With a TMS320C31-Based Embedded System; TMS320C31 Block Diagram; Central Processing Unit (CPU); Memory Organization; TMS320C31 Memory Maps; Peripheral Modules; DMA Controller; VPRO-4 Hardware Architecture; System Diagram; Doble Test Set-Up; The New Doble M-Series System; Data Flow Optimizations for TMS320C31 Compilers; Copy Propagation and Control Flow Simplification for TMS320C31 Compilers; In-Line Function Ex
6. portion of an object file to work with systems that have code in ROM. The debugger can execute commands from a batch file, providing an easy method for entering often-used command sequences. Key features include: Multilevel debugging: The debugger allows you to debug both C and assembly language code. While debugging a C program, you can choose to view the C source, the disassembly of the object code created from the C source, or both. Fully configurable, state-of-the-art, window-oriented interface: The debugger separates code, data, and commands into manageable information. You can select from several displays or, since the debugger's display is completely configurable, you can create the interface that best suits the application. The display's colors, the physical appearance of displayed features such as window borders, and window size and position can be changed. Flexible command entry: Commands can be entered by using a mouse, the function keys, or the pull-down menus. The debugger's command history can be used to re-enter commands. On-screen editing: Any data value displayed in any window can easily be changed by pointing with the mouse at the value, clicking, and entering the correct value. Continuous update: The debugger continuously updates information on the screen, highlighting changed values. Comprehensive data display: You can easily create windows for displaying and editing the values of v
7. 16K-word zero-wait-state SRAM, allowing coding of most algorithms directly on the board; an analog interface for embedded systems development; an external serial port interface that can be used for connecting multiple EVMs or for extra analog interfacing; a host port for PC communications; embedded emulation support via the 74ACT8990 test bus controller. The system also comes with all of the software required to begin application development on a PC host. The window-oriented, mouse-driven interface supports downloading, executing, and debugging of assembly code or C code, including modification/display of memory and registers, software single-step, and breakpoint capabilities. The TMS320C3x assembler/linker is also included with the EVM. For high-level language programming, the optimizing ANSI C and the Ada compilers are offered separately. The TMS320C3x EVM is supported on PC/AT MS-DOS (version 3.00 or higher) platforms. 5.6 TMS320C3x Emulator. The TMS320 Extended Development Systems (XDSs) are powerful, full-speed emulators used for system-level integration and debug. TI developed the world's first in-system, scan-based emulator (XDS) for TMS320C3x processors. Scan-based emulation is a unique, nonintrusive approach to system emulation, integration, and debug. This approach was conceived and developed by TI to address hardware/software characteristics (reduced
8. Parallel Arithmetic With Store Instructions (mnemonic: description; operation):
ADDF3 || STF: Add floating point; src1 + src2 → dst1 || src3 → dst2
AND3 || STI: Bitwise logical AND; src1 AND src2 → dst1 || src3 → dst2
ASH3 || STI: Arithmetic shift; if count ≥ 0: src2 << count → dst1 || src3 → dst2; else: src2 >> |count| → dst1 || src3 → dst2
LDF || STF: Load floating point; src2 → dst1 || src3 → dst2
LDI || STI: Load integer; src2 → dst1 || src3 → dst2
LSH3 || STI: Logical shift; if count ≥ 0: src2 << count → dst1 || src3 → dst2; else: src2 >> |count| → dst1 || src3 → dst2
MPYF3 || STF: Multiply floating point; src1 x src2 → dst1 || src3 → dst2
MPYI3 || STI: Multiply integer; src1 x src2 → dst1 || src3 → dst2
NEGF || STF: Negate floating point; 0 - src2 → dst1 || src3 → dst2
NEGI || STI: Negate integer
NOT || STI: Complement
OR3 || STI: Bitwise logical OR
SUBF3 || STF: Subtract floating point
SUBI3 || STI: Subtract integer
XOR3 || STI: Bitwise exclusive OR
MPYF3 || ADDF3: Multiply and add floating point
MPYF3 || SUBF3: Multiply and subtract floating point
MPYI3 || ADDI3: Multiply and add integer
MPYI3 || SUBI3: Multiply and subtract integer
Parallel Store Instructions:
STF || STF: Store floating point
STI || STI: Store integer
LEGEND: src
9. [Figure 5-1 compiler output, surviving comments: 5j; load shift count 3; j << a (= j << 3); j << b; (j << b) + (j << a); push d; push c; push b; push a (tracked in R1); call.] The constant 3, assigned to a, is copy propagated into all uses of a; a becomes a dead variable and is removed completely. The sum of multiplying j by 3 (a) and 2 is simplified into a multiply by 5, which is computed with a shift and add. The expression (j << a) is computed once for assignment to c and then reused for calculating d. These optimizations are also performed across jumps. Branch Optimizations/Control-Flow Simplification: The compiler analyzes the branching behavior of a program and rearranges the linear sequences of operations (basic blocks) to remove branches or redundant conditions. Unreachable code is deleted, branches to branches are bypassed, and conditional branches over unconditional branches are simplified to a single conditional branch. When the value of a condition can be determined at compile time (through copy propagation or other data flow analysis), a conditional branch can be deleted. Switch case lists are analyzed in the same way as conditional branches and are sometimes eliminated entirely. Some simple control flow constructs can be reduced to conditional instructions, totally eliminating the need f
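The data flow optimizations described in this excerpt can be mimicked by hand in C. The sketch below is only an illustration: the function bodies follow the example discussed in the text (a = 3, b = 3j + 2j, c = j << a, d = (j << a) + (j << b)), while the names `simp_plain`, `simp_folded`, and `struct simp_args` are hypothetical, and the input range is restricted so every shift count stays well defined in standard C.

```c
#include <assert.h>

/* Hypothetical result holder so the two versions can be compared. */
struct simp_args { int a, b, c, d; };

/* Straightforward version, written the way the source example reads. */
struct simp_args simp_plain(int j)
{
    assert(j >= 0 && j <= 3);     /* keeps the shift count b = 5j below 32 */
    int a = 3;
    int b = (j * a) + (j * 2);
    int c = j << a;
    int d = (j << a) + (j << b);
    struct simp_args r = { a, b, c, d };
    return r;
}

/* Hand-applied equivalent of the optimizations named in the text. */
struct simp_args simp_folded(int j)
{
    assert(j >= 0 && j <= 3);
    int b = (j << 2) + j;         /* 3j + 2j = 5j, as one shift and one add */
    int c = j << 3;               /* constant 3 propagated in place of a    */
    int d = c + (j << b);         /* j << a computed once, then reused      */
    struct simp_args r = { 3, b, c, d };
    return r;
}
```

Both functions return the same four values for every permitted j; whether a given compiler release produces exactly this form is not something the excerpt confirms.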
10. Precise Software Technologies Inc. Concurrent Program Development: Precise MPX provides over 90 primitives to support program development. These can be grouped into the following major categories: task management, inter-task communication, interrupt management, memory management, and server management. A software designer uses the Precise MPX tasking model, interrupt management primitives, and inter-task communications primitives to solve a realtime problem by breaking it down into concurrent tasks that communicate via well-defined messages. A task is simply a C language function implemented as an iterative loop. Inter-task communication primitives pass messages between tasks and implicitly provide concurrency, which simplifies realtime design and implementation. Task Management: Precise MPX has the capability to completely manage the state of tasks while an application is executing. This capability is especially important for realtime applications that require recovery or reconfiguration, or that have resource limitations. Application tasks are defined to the Precise MPX kernel through a data structure which specifies priority, stack size, and the symbolic name of the first function of the task. All application tasks except for main tasks are managed explicitly by the application using the Create and _Destroy task management primitives. Tasks are very lightweight. A task context is maintained in a 128-byte task descripto
11. Shifted Dreg right by |count| → Dreg
ASH3: Arithmetic shift (3-operand); if count ≥ 0: shifted src left by count → Dreg; else: shifted src right by |count| → Dreg
FIX: Convert floating-point value to integer; Fix(src) → Dreg
FLOAT: Convert integer to floating-point value; Float(src) → Rn
LSH: Logical shift; if count ≥ 0: Dreg left-shifted by count → Dreg; else: Dreg right-shifted by |count| → Dreg
LSH3: Logical shift (3-operand); if count ≥ 0: src left-shifted by count → Dreg; else: src right-shifted by |count| → Dreg
NORM: Normalize floating-point value; Normalize(src) → Rn
RND: Round floating-point value; Round(src) → Rn
ROL: Rotate left; Dreg rotated left 1 bit → Dreg
Table 2-7. Arithmetic Instruction Summary (Concluded)
ROLC: Rotate left through carry; Dreg rotated left 1 bit through carry → Dreg
ROR: Rotate right; Dreg rotated right 1 bit → Dreg
RORC: Rotate right through carry; Dreg rotated right 1 bit through carry → Dreg
SUBB: Subtract integers with borrow; Dreg - src - C → Dreg
SUBB3: Subtract integers with borrow (3-operand); src1 - src2 - C → Dreg
SUBC: Subtract integers conditionally; if Dreg - src ≥ 0: ((Dreg - src) << 1) OR 1 → Dreg; else: Dreg << 1 → Dreg
Table 2-8. Parallel Instruction Set Summary
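The SUBC entry (if Dreg - src ≥ 0, then ((Dreg - src) << 1) OR 1 → Dreg, else Dreg << 1 → Dreg) is the building block of a shift-and-subtract divide. The following C model is a sketch under stated assumptions: operands are treated as unsigned, the divisor is pre-aligned by shifting it left 15 bits, and the operands are limited to 16 bits so a 32-bit register never overflows. The helper names `subc_step` and `div16_subc` are hypothetical and do not come from the brief.

```c
#include <assert.h>
#include <stdint.h>

/* One conditional-subtract step, modeled on the table entry for SUBC. */
static uint32_t subc_step(uint32_t dreg, uint32_t src)
{
    if (dreg >= src)                       /* Dreg - src >= 0 */
        return ((dreg - src) << 1) | 1u;   /* keep difference, shift in a 1 */
    return dreg << 1;                      /* otherwise shift in a 0 */
}

/* Hypothetical helper: 16-bit unsigned divide built from 16 SUBC steps. */
void div16_subc(uint16_t n, uint16_t d, uint16_t *quot, uint16_t *rem)
{
    assert(d != 0);
    uint32_t acc = n;
    uint32_t src = (uint32_t)d << 15;      /* align divisor with dividend MSB */
    for (int i = 0; i < 16; i++)
        acc = subc_step(acc, src);
    *quot = (uint16_t)(acc & 0xFFFFu);     /* quotient bits collect at the bottom */
    *rem  = (uint16_t)(acc >> 16);         /* remainder is left in the upper half */
}
```

Each step produces one quotient bit, so the loop count equals the operand width; on the device a repeat construct would typically drive the same step.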
12. vides a number of features, including: Elegant user interface: The TI code profiler shares the same fully configurable, window-oriented, and mouse-driven interface as the TI C source debugger, so learning to profile is quick and easy. Multilevel profiling: An assembly window and a C window are displayed, so you can profile C code, assembly code, or both simultaneously. Powerful command set: A rich set of commands is available to select and manipulate profile areas on the global, module, function, and explicit levels, so you can efficiently profile even the most complex applications. Comprehensive statistics: The profiler provides all the information you need to identify bottlenecks in your code: the number of times each area was entered during the profile session; the total execution time of an area, including or excluding the execution time of any subroutines called from within that area; and the maximum time for one iteration of an area, including or excluding the execution time of any subroutines called from within that area. Versatile display: The ability to choose profile areas, the type of statistical data, and sorting criteria ensures an efficient, customized display of the statistics. The data can also be accompanied by histograms to show the statistical relationship between profile areas. Disabled
13. 5.8 HP 64776 Analysis Subsystem; 5.9 TMS320 Technical Support. 5.1 TMS320C3x Optimizing ANSI C Compilers. Fast code development and code maintenance over the life of a product are concerns that all developers share. TI supports embedded system developers with an optimizing compiler for the TMS320C31, which translates ANSI-standard C language files into highly efficient TMS320C31 assembly language source files, which are then input to a TMS320C31 assembler/linker. The compiler has been validated for conformance to the ANSI C specification using the industry-standard Plum Hall test suite. The TMS320C31 compiler is complemented by the standard TMS320 Programmer's Interface for debugging C and assembly source code. The C compiler produces a rich set of debugging information, which is used by the debugger, allowing source-level debugging in C. This enhances productivity and shortens the development cycle for embedded system designers. Key features include: Complete and exact conformance with the ANSI C specification. Highly efficient code: The compiler incorporates state-of-the-art generic and target-specific optimizations, described in detail within the succeeding subsections. The TMS320C31 compiler performs both global optimizations and loop optimizations such as strength reduction. Additionally, it thoroughly a
14. [Table A-1. TMS320 Family Overview (continued). Legible rows: TMS320E15† (256; 4K; 4K; 8x16; 200; DIP/CER-QUAD) and TMS320E15-25 (256; 4K; 4K; 8x16; 160; DIP/CER-QUAD).] † Military version available/planned; contact nearest TI Field Sales Office for availability. ‡ Ser = serial; Par = parallel; DMA = direct memory access (Int = internal; Ext = external); Com = parallel communication ports. [Table A-1. TMS320 Family Overview (Concluded). Column headings include I/O, on-chip timer, cycle time, and package type; the rows cover the floating-point devices, including the TMS320C30 and TMS320C31.]
15. NU Stop History Saving; NU Relinquish; NU Stop Performance Timer; NU Sleep; NU Stop; NU Reset; NU Retrieve Task3; NU Current Task ID. Clock Management: NU Set Time; NU Read Timer. Fixed-Size Memory Management: NU Alloc Partition; NU Available Partitions; NU Dealloc Partition. Variable-Size Memory Management: NU Alloc Memory; NU Available Memory; NU Dealloc Memory. Queue Management: NU Send Item; NU Force Item In Front; NU Retrieve Item; NU Retrieve Item Mult; NU Retrieve Queue Status. Event Management: NU Set Events; NU Wait For Events. Resource Management: NU Request Resource; NU Retrieve Resource Status; NU Release Resource. 6.2 A.T. Barrett & Associates, Inc., 11501 Chimney Rock, Houston, Texas 77035; (800) 525-4302; (713) 728-9688; FAX (713) 728-1049. RTXC: Realtime Kernel for single-processor systems. RTXC/MP: Realtime Kernel for multiple-processor systems. RTXC and RTXC/MP are fully preemptive, priority-driven realtime kernels written in ANSI C that enable you to tap the full power of the TMS320C3x processors in realtime environments. Released in 1985, RTXC has been continuously upgraded. Demonstration and benchmark disks on RTXC and RTXC/MP are available free of charge. An evaluation package containing a full kernel, a special user's manual, and special utilities to assist in evaluation of the kernel is also available. The package
16. Prefix: TMX = Experimental Device; TMP = Prototype Device; TMS = Qualified Device; SMJ = MIL-STD-883C. Temperature range: H = 0 to 50°C; L = 0 to 70°C; S = -55 to 100°C; M = -55 to 125°C; A = -40 to 85°C. Device family: 320 = TMS320 Family. Technology: C = CMOS; E = CMOS EPROM; LC = Low-power CMOS; P = One-time programmable. Package type: N = Plastic DIP; JD = Ceramic DIP, side-brazed; FN = Plastic leaded CC; GB = Ceramic PGA; FJ = Ceramic leaded CC; FD = Leadless ceramic CC; FZ = Ceramic leaded CC; GE = Ceramic PGA (glass seal); HU = Ceramic quad flatpack; HT = Ceramic quad flatpack (gull wing); PQ = Plastic quad flatpack. Device: 1st-generation DSP: 10, 14, 15, 16, 17; 2nd-generation DSP: 20, 25, 26, 28; 3rd-generation DSP: 30, 31; 4th-generation DSP: 40; 5th-generation DSP: 50, 51, 53.
17. To perform these operations efficiently, Nicolet wanted a slave processor that would allow low system cost and high data movement and numeric processing performance using the C language. Nicolet selected the C31 due to its balance of price and performance over RISC solutions. In addition, for extremely cost-sensitive designs, Nicolet is considering the C31 to integrate the functionality of both the master and slave processors. [Figure 4-2. System Diagram: device under test, acquisition memory, master CPU, coprocessor CPU, shared memory.] For Nicolet's data acquisition equipment, the processor must move data and complete calculations in realtime and also have enough performance to display the information in a reasonable amount of time. To fulfill these requirements, the slave processor needed the following characteristics: high data movement rate; fast address generation capability; realtime calculation of waveform pulse parameters; floating-point fast Fourier transformation (FFT) of input samples to enable the frequency domain display of the data; performance of other realtime DSP operations, including filtering, correlation, and convolution. The importance of these device characteristics is illustrated in some of the algorithms Nicolet uses in its data acquisition equipment:
18. archive shuffle, waveform processing, and FFT. 4.2.2 Archive Shuffle. When Nicolet's equipment digitizes a waveform, the trigger or start point is not necessarily at the first location in digitizer memory. The archive shuffle algorithm moves the trigger point to the first location without using additional data memory (in-place data movement). Even though the archive shuffle algorithm did not take advantage of a DMA controller, the C31 is efficient at performing the data shuffle due to its single-cycle instructions and auxiliary register arithmetic units, which can generate two pointer addresses every instruction cycle. Nicolet modified the algorithm to use the C31's on-chip DMA to move blocks of data in and out of the C31 in parallel with the C31 CPU calculating the source and destination addresses of subsequent blocks. With the use of the DMA, Nicolet estimated that the time required to shuffle a block of data was reduced to 35% of the time required for the non-DMA implementation. 4.2.3 Waveform Processing. Waveform processing involves calculating waveform parameters such as area, rise time, root mean square (RMS), and standard deviation. The waveform processing must be performed on 1K samples fast enough to allow 5 to 10 user screen updates per second. The C31 provided more than enough performance to meet the screen update requirements. With its single-cycle multiply capabil
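The excerpt does not say how Nicolet's archive shuffle is implemented, only that it moves the trigger point to the first location in place. One standard way to get that effect is a rotation by three reversals, sketched below in C. The function names `reverse_range` and `archive_shuffle` and the 32-bit sample type are assumptions made for illustration, not details from the brief.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the half-open range buf[lo..hi) in place. */
static void reverse_range(int32_t *buf, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        hi--;
        int32_t tmp = buf[lo];
        buf[lo] = buf[hi];
        buf[hi] = tmp;
        lo++;
    }
}

/* Rotate buf so the sample at index `trigger` ends up at index 0,
 * using no buffer beyond a single temporary word. */
void archive_shuffle(int32_t *buf, size_t len, size_t trigger)
{
    assert(buf != NULL && trigger < len);
    reverse_range(buf, 0, trigger);    /* reverse the pre-trigger samples   */
    reverse_range(buf, trigger, len);  /* reverse the post-trigger samples  */
    reverse_range(buf, 0, len);        /* reverse everything: rotation done */
}
```

The triple reversal touches each sample twice; a block-oriented variant would be a more natural fit for the DMA-assisted version the text mentions, but that design is not described in enough detail here to reproduce.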
19. cycle time, package type, technology, and availability. Many features are common among these TMS320 processors. When the term TMS320 is used, it refers to all five generations of DSP devices. When referring to a specific member of the TMS320 family (e.g., TMS320C15), the name also implies enhanced-speed in MHz (-14, -25, etc.), erasable/programmable (TMS320E15), low-power (TMS320LC15), and one-time programmable (TMS320P15) versions. Specific features are added to each processor to provide different cost/performance alternatives. Software compatibility is maintained throughout the family to protect your investment. Each processor has code-generation, system integration, and debug tools to facilitate the design process. [Figure A-1. TMS320 Device Evolution (The TMS320 Product Roadmap). Fixed-point generations: TMS320C10, TMS320C10-14/-25, TMS320C14, TMS320E14/P14, TMS320C15/LC15, TMS320E15/P15, TMS320C15-25, TMS320E15-25, TMS320C16, TMS320C17/LC17, TMS320E17/P17; TMS320C25, TMS320E25, TMS320C25-33, TMS320C25-50, TMS320C26, TMS320C28; TMS320C50, TMS320C51, TMS320C53. Floating-point generations: TMS320C30, TMS320C30-27, TMS320C30-40, TMS320C31, TMS320C31-27, TMS320C31-40; TMS320C40, TMS320C40-40.] Table A-1. TMS320 Family Overview
20. with in-line code, saving the overhead associated with a function call as well as providing increased opportunities to apply other optimizations. See Figure 5-2 and Figure 5-3. [Figure 5-2. Copy Propagation and Control Flow Simplification for TMS320C31 Compilers. C source: a function fsm() with an enum of states ALPHA, BETA, GAMMA, and OMEGA, an int input pointer, and a switch on state inside a while loop that runs until state is OMEGA; ALPHA selects BETA or GAMMA, BETA selects GAMMA or ALPHA, and GAMMA selects GAMMA or OMEGA. The TMS320C31 compiler output replaces each state with a label and conditional branches on the input.] The switch statement and the state variable from this simple finite state machine process are optimized completely away, leaving a streamlined series of conditional branches. [Figure 5-3. In-Line Function Expansion for TMS320C31 Compilers: an inline blkcpy function that tests n > 0 and copies in a do loop.]
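To make the finite state machine example concrete, here is a C sketch of the two forms: a switch-driven loop and a hand-written version in which the state variable and switch have been replaced by labels and conditional branches. The comparison operators did not survive in the excerpt, so the tests against zero are an assumption, and the names `fsm_switch` and `fsm_branches` (plus the returned count of consumed inputs) are hypothetical additions that make the two versions comparable.

```c
#include <assert.h>
#include <stddef.h>

enum state { ALPHA, BETA, GAMMA, OMEGA };

/* Switch-driven form: returns how many inputs were consumed before OMEGA. */
size_t fsm_switch(const int *input)
{
    assert(input != NULL);
    const int *p = input;
    enum state state = ALPHA;
    while (state != OMEGA) {
        switch (state) {
        case ALPHA: state = (*p++ == 0) ? BETA  : GAMMA; break;
        case BETA:  state = (*p++ == 0) ? GAMMA : ALPHA; break;
        case GAMMA: state = (*p++ == 0) ? GAMMA : OMEGA; break;
        default: break;
        }
    }
    return (size_t)(p - input);
}

/* Same machine with the state variable and the switch removed by hand. */
size_t fsm_branches(const int *input)
{
    assert(input != NULL);
    const int *p = input;
alpha:
    if (*p++ != 0) goto gamma;   /* ALPHA: nonzero input goes to GAMMA */
    if (*p++ != 0) goto alpha;   /* BETA: nonzero input returns to ALPHA */
gamma:
    while (*p++ == 0)            /* GAMMA: stay while the input is zero */
        ;
    return (size_t)(p - input);  /* OMEGA reached */
}
```

The caller must supply an input sequence that eventually drives the machine to OMEGA; neither version checks array bounds.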
21. 0. If the operands are either signed or unsigned integers, only bits 31-0 are used; bits 39-32 remain unchanged. Bits 39-32 remain unchanged for all shift operations. The 32-bit auxiliary registers (AR7-AR0) can be accessed by the CPU and modified by the two Auxiliary Register Arithmetic Units (ARAUs). The primary function of the auxiliary registers is the generation of 24-bit addresses. They can also be used as loop counters or as 32-bit general-purpose registers that can be modified by the multiplier and ALU. The data page pointer (DP) is a 32-bit register. The eight LSBs of the data page pointer are used by the direct addressing mode as a pointer to the page of data being addressed. Data pages are 64K words long, with a total of 256 pages. The 32-bit index registers (IR0, IR1) contain the value used by the Auxiliary Register Arithmetic Unit (ARAU) to compute an indexed address. The ARAU uses the 32-bit block size register (BK) in circular addressing to specify the data block size. The system stack pointer (SP) is a 32-bit register that contains the address of the top of the system stack. The SP always points to the last element pushed onto the stack. A push performs a preincrement, and a pop performs a postdecrement of the system stack pointer. The SP is manipulated by interrupts, traps, calls, returns, and the PUSH and POP instructions. The status register (ST) contains global information relating to the state of the CPU. Typically
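Two of the register behaviors described here are easy to model in C: forming a direct address from the eight LSBs of DP (256 pages of 64K words), and the preincrement push / postdecrement pop of the stack pointer. This is a toy model, not device code. The 16-bit in-page offset parameter, the 16-word stack region, and all names (`direct_address`, `struct stack_model`, `push_word`, `pop_word`) are assumptions introduced for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Direct addressing: the 8 LSBs of DP select one of 256 pages of 64K words;
 * the 16-bit offset (assumed to be supplied separately) selects the word. */
uint32_t direct_address(uint32_t dp, uint16_t offset)
{
    return ((dp & 0xFFu) << 16) | offset;
}

/* Toy model of the system stack: SP points at the last element pushed. */
struct stack_model {
    uint32_t mem[16];   /* hypothetical 16-word stack region */
    uint32_t sp;        /* index of the last pushed word */
};

void push_word(struct stack_model *s, uint32_t value)
{
    assert(s->sp + 1 < 16);
    s->mem[++s->sp] = value;     /* preincrement SP, then store */
}

uint32_t pop_word(struct stack_model *s)
{
    assert(s->sp > 0);
    return s->mem[s->sp--];      /* read at SP, then postdecrement */
}
```

With DP = 80h and an offset of 9800h, `direct_address` yields 809800h; only the low eight bits of DP take part, so the upper 24 bits of the register are ignored in the model.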
22. Ada compilation systems can be hosted on either the Digital Equipment Corporation VAX series equipment running the VMS operating system (version 5.2 or later) or on the Sun SPARC platforms running the SunOS operating system (version 4.1.1 or later). Available options include an interface to Spectron's SPOX DSP vector, matrix, and filter math functions; TI simulator; facilities for customizing the runtimes; and the AdaScope retargeting kit to adapt to a different hardware configuration or communications protocol. Ada Compiler for the TMS320C30: Tartan's Ada compiler for the SMJ320C30 (the military version of the TMS320C30) supports VAX/VMS and Sun's SPARC systems. The compiler implements Ada as defined in ANSI/MIL-STD-1815A-1983 and is validated under the latest DOD ACVC test suite. Tartan, Inc.: Tartan Ada C30-targeted compilation systems produce highly optimized application code that runs on the TMS320C30 processors. The compilation system consists of: full-function optimizing Ada compiler; Tartan Ada Library, which implements the Ada language requirements for separate compilation and dependency control; Tartan Ada Runtime System, including precompiled standard Ada packages for I/O and other facilities and precompiled C30-specific packages; Tartan cross-reference facility (TXREF); Tartan Ada Runtime Client Package (ART Client), allowing on-site customizing of the runtime; library of elementary math and trigonometric
23. [Figure: Branch Optimizations. C source: a function wait() taking a volatile int pointer p, with an endless for loop that tests *p against 0x80 and updates *p using 0xF0. TMS320C31 compiler output comments: AR4 is allocated to p; test *p & 0x80; false: loop back; true: loop back (delayed); branch occurs.] The unconditional branch at the bottom of this loop is written as a delayed branch, allowing it to execute in one machine cycle. Use of Registers for Passing Function Arguments: The compiler supports a new, optional calling sequence that passes arguments in registers rather than pushing them onto the stack. This can result in significant improvement in performance, especially if calls are important in the application. See Figure 5-2. Parallel Instructions: Several floating-point or integer instructions, such as load/load, store/operate, and multiply/add, can be paired with each other and executed in parallel. When adjacent instructions match the addressing requirements, the compiler combines them in parallel. Although the code generator performs this optimization, the optimizer greatly increases effectiveness because operands are more likely to be in registers. See Figure 5-3 and Figure 5-5. Conditional Instructions: The load instructions in the C31 can be executed conditionally. For simple assignments such as a = condition ? expr1 : expr2 or if (condition) a = b, the compiler can use conditional loads to avoid costly branches. Devel
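The two source forms named under Conditional Instructions look like this in ordinary C. The sketch only shows the shapes of code that give a compiler the chance to use a conditional load instead of a branch; whether a particular compiler release does so is not confirmed by the excerpt. The function names `select_max`, `clamp_upper`, and `clamp_range` are hypothetical.

```c
#include <assert.h>

/* Form 1: a = condition ? expr1 : expr2 */
int select_max(int x, int y)
{
    int a = (x > y) ? x : y;
    return a;
}

/* Form 2: if (condition) a = b */
int clamp_upper(int a, int limit)
{
    if (a > limit)
        a = limit;
    return a;
}

/* Two guarded assignments in a row, each a candidate for a conditional load. */
int clamp_range(int a, int lo, int hi)
{
    assert(lo <= hi);
    if (a < lo)
        a = lo;
    if (a > hi)
        a = hi;
    return a;
}
```

Keeping both arms of the assignment free of side effects is what makes these forms simple enough to be candidates; an arm that calls a function or writes memory would generally need a real branch.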
24. operations that produce values already computed. The compiler performs these data flow optimizations both locally (within basic blocks) and globally (across entire functions). See Figure 5-1 and Figure 5-2. Copy Propagation: Following an assignment to a variable, the compiler replaces references to the variable with its value. The value could be another variable, a constant, or a common subexpression. This may result in increased opportunities for constant folding, common subexpression elimination, or even total elimination of the variable. Common Subexpression Elimination: When the same value is produced by two or more expressions, the compiler computes the value once, saves it, and reuses it. Redundant Assignment Elimination: Often, copy propagation and common subexpression elimination optimizations result in unnecessary assignments to variables (variables with no subsequent reference before another assignment or before the end of the function). The compiler removes these dead assignments. [Figure 5-1. Data Flow Optimizations for TMS320C31 Compilers. C source: simp(int j) declares int a, b, c, and d and passes them to call(a, b, c, d). TMS320C31 compiler output: j is allocated to a user variable and the intermediate values to temporary variables.]
25. O optically isolated. Daughter boards for other applications, such as binary image acquisition and general-purpose I/O, are currently under development. A serial-port-based software monitor program is available to aid with the development of embedded control algorithms. Figure 6-2. MX31 Fitted With a Preliminary CCD Camera Interface Daughter Board. 6.8 Loughborough Sound Images Ltd., The Technology Centre, Epinal Way, Loughborough, Leicestershire LE11 0QE, England; +44 509 231843. TMS320C31 PC/AT Embedded DSP Board: The PC/C31 is a 3/4-length PC/AT-compatible board intended for embedded signal processing and control applications. The board's architecture gives complete access to all of the TMS320C31's facilities and adds a variety of peripheral interface options. The PC/C31 is ideal for a wide range of embedded applications, from realtime closed-loop control to online signal processing. The high-performance, low-cost, 32-bit floating-point TMS320C31's features make it ideally suited to application areas not previously considered. Coupled with LSI's range of peripherals, complete application systems can be assembled quickly and easily. Features include: Complete TMS320C31 process
26. Figure: TMS320C31 block diagram detail (data bus D31-D0, address bus A23-A0, DADDR2 bus, DMADATA bus, DMAADDR bus, program counter, instruction register, CPU, DMA controller)
2.3.2 Memory Maps
There are two TMS320C31 memory maps. Use of either one depends on whether the processor is running in the microprocessor mode (MCBL/MP = 0) or the bootloader mode (MCBL/MP = 1). The memory maps are similar (see Figure 2-4). All of the memory-mapped peripheral registers are in locations 808000h through 8097FFh. In both modes, RAM block 0 is located at addresses 809800h through 809BFFh, and RAM block 1 is located at addresses 809C00h through 809FFFh.
In microprocessor mode, the bootloader ROM is not mapped into the TMS320C31 memory map. Locations 0h through 0BFh consist of interrupt vector, trap vector, and reserved locations, all of which are accessed over the external memory port (STRB active). Locations 0C0h through 07FFFFFh and locations 80A000h through 0FFFFFFh are also accessed using STRB.
In bootloader mode, the bootloader ROM is mapped into locations 0h through 0FFFh. There are 192 locations (0h through 0BFh) within this block for the C31 bootloader program. Locations 1000h through 07FFFFFh and locations 80A000h through 0FFFFFFh are also accessed using STRB.
Figure 2-4. TMS320C31 Memory Maps
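The address ranges in the text above can be collected into a small lookup routine. The following is a minimal sketch, not TI code: the enum, the function name c31_classify, and the bootloader_mode flag are hypothetical, and the range 800000h through 807FFFh is reported as "not described" because the text above does not say what it holds.

```c
#include <stdint.h>

/* Regions named in the memory map description (names are hypothetical). */
typedef enum {
    REGION_BOOT_ROM,       /* bootloader mode only: 0h-0FFFh             */
    REGION_EXTERNAL_STRB,  /* accessed over the external port with STRB  */
    REGION_PERIPHERALS,    /* 808000h-8097FFh                            */
    REGION_RAM_BLOCK0,     /* 809800h-809BFFh                            */
    REGION_RAM_BLOCK1,     /* 809C00h-809FFFh                            */
    REGION_NOT_DESCRIBED   /* 800000h-807FFFh, or beyond 24 bits         */
} c31_region;

/* Classify a 24-bit address. bootloader_mode is nonzero for MCBL/MP = 1. */
c31_region c31_classify(uint32_t addr, int bootloader_mode)
{
    if (addr > 0xFFFFFFu)                        return REGION_NOT_DESCRIBED;
    if (bootloader_mode && addr <= 0x000FFFu)    return REGION_BOOT_ROM;
    if (addr <= 0x7FFFFFu)                       return REGION_EXTERNAL_STRB;
    if (addr <= 0x807FFFu)                       return REGION_NOT_DESCRIBED;
    if (addr <= 0x8097FFu)                       return REGION_PERIPHERALS;
    if (addr <= 0x809BFFu)                       return REGION_RAM_BLOCK0;
    if (addr <= 0x809FFFu)                       return REGION_RAM_BLOCK1;
    return REGION_EXTERNAL_STRB;                 /* 80A000h-0FFFFFFh */
}
```

For example, c31_classify(0x0, 0) reports the external port (the vectors are fetched over STRB in microprocessor mode), while c31_classify(0x0, 1) reports the bootloader ROM.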
27. addr: 24-bit immediate address (label); count: shift value (general addressing modes); cond: condition code (see Chapter 11); SP: stack pointer; ST: status register; GIE: global interrupt enable register; RE: repeat interrupt register; RM: repeat mode bit; RS: repeat start register; TOS: top of stack; PC: program counter; C: carry bit
Table 2-5. Logical and Bit Manipulation Instruction Summary
Mnemonic   Description                                  Operation
ANDN3      Bitwise logical ANDN (3-operand)             src1 AND (NOT src2) -> Dreg
CMPF       Compare floating-point values                Set flags on Rn - src
CMPF3      Compare floating-point values (3-operand)    Set flags on src1 - src2
CMPI       Compare integers                             Set flags on Dreg - src
CMPI3      Compare integers (3-operand)                 Set flags on src1 - src2
OR         Bitwise logical OR                           Dreg OR src -> Dreg
OR3        Bitwise logical OR (3-operand)               src1 OR src2 -> Dreg
XOR
LEGEND: src: general addressing modes; src1: three-operand addressing modes; src2: three-operand addressing modes; Csrc: conditional branch addressing modes; Sreg: register address (any register); Dreg: register address (any register); Rn: register address (R7-R0); Daddr: destination memory address; ARn: auxiliary register n (AR7-AR0); addr: 24-bit immediate address (label); count: shift value (general addressing modes); cond: condition code (see Chapter 11); SP: stack pointer; ST: status register; GIE: global interrupt enable register; RE re
28. and for troubleshooting macro definitions
- Assembly listing file with line numbers and opcodes
- A big memory model with unlimited space for global data, static data, and constants. In the small (default) model, this space is limited to 64K words for faster, more efficient code execution.
5.1.1 TMS320C31 Compiler Optimizations
The efficiency of a C compiler depends upon the scope and number of optimizations the C compiler performs, as well as upon the application. The TMS320C31 compiler performs a wide variety of optimizations to improve the efficiency of the compiled code. The following list and the explanations that follow describe some of the optimizations and highlight particular strengths of the C compilers.
General-Purpose C Optimizations
- Algebraic reordering, symbolic simplification, constant folding
- Alias disambiguation
- Data flow optimizations
  - Copy propagation
  - Common subexpression elimination
  - Redundant assignment elimination
- Branch optimizations, control flow simplification
- Loop induction variable optimizations, strength reduction
- Loop rotation
- Loop invariant code motion
- In-line expansion of function calls
Optimizations Specific to the TMS320C31 Compiler
- Register variables
- Register tracking/targeting
- Cost-based register allocation
- Autoincrement addressing modes
- Repeat blocks
- Delayed branches
- Use of registers for pass
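Two of the loop optimizations named in the list above, strength reduction of a loop induction variable and loop invariant code motion, can be shown with a hand-worked pair of functions. This is a sketch under stated assumptions: sum_before and sum_after are hypothetical names, and the second function is what a programmer would write by hand to mimic the optimizations, not actual TMS320C31 compiler output.

```c
/* As written: i * stride is multiplied out on every iteration, and
 * scale * offset is recomputed even though it never changes in the loop. */
long sum_before(const int *data, int n, int stride, int scale, int offset)
{
    long sum = 0;
    for (int i = 0; i < n; ++i)
        sum += data[i * stride] + scale * offset;
    return sum;
}

/* Hand-optimized equivalent. */
long sum_after(const int *data, int n, int stride, int scale, int offset)
{
    long sum = 0;
    const int k = scale * offset;  /* loop invariant code motion: hoisted  */
    int idx = 0;                   /* strength reduction: the multiply
                                      i * stride becomes a running add     */
    for (int i = 0; i < n; ++i) {
        sum += data[idx] + k;
        idx += stride;
    }
    return sum;
}
```

On a processor with autoincrement addressing, the running index in sum_after maps naturally onto a post-modified auxiliary register, which is one reason these two optimizations are usually applied together.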
29. and there is 1 to 8 megabytes of multiported shared memory on the board. All four C31s and the PC host can read and write into this shared memory. A robust set of tokens, semaphores, and interrupts facilitates interprocessor communications via software-defined memory structures. Communications with the PC are streamlined by a PC bus I/O-mapped control port, which provides for unintrusive polling operations. Realtime voice I/O to a standard voice bus (Dialogic PEB or Natural Microsystems MVIP) is done over the serial port of the C31 via an ASIC interface chip.
Figure 4-1. VPRO-4 Hardware Architecture (blocks: 1- to 8M-byte DRAM, PCM, ISA interface, global PC/AT resources, bus, auxiliary port)
Telecommunications Example Using SPOX
4.1.4 From Tiger 30 to Realtime Recognition
VPC developed software with the Tiger 30 development board from DSP Research. Two discrete word recognizers could run on a single C31 (eight recognizers on a single ISA board). They also used the board to experiment with SPOX to help them understand its capabilities and performance better. It took about six months to build the VPRO-4 hardware prototype using the Tiger 30 and SPOX. Because the Tiger 30 board did not interface to the voice bus, they tested their recognizer with canned voice data stored on the host file system. After the VPRO-4 hardware and the necessary low-level software for loading and interfacing to the board was compl
30. application functions. Without using in-lining, the TMS320C31 provides 24,876 Dhrystones/sec.
3.2.1 Dhrystone Benchmark
The Dhrystone benchmark was originally used to measure device performance and compiler efficiency in typical host CPU integer applications. It does not include input/output or operating system operations. Table 3-3 shows the results for Dhrystone version 1.1 because processor benchmark results are more widely available for version 1.1 than for later versions of the benchmark.
3.2.2 Bubble and Quick Sort Benchmarks
The bubble sort program performs a bubble sort on an array of elements, and the quick sort program uses the quicksort algorithm to sort an array of elements.
3.2.3 matmult Benchmark
matmult is a routine that multiplies two 7x7 matrices together. The 7x7 matrices are subsets of 8x8 matrices.
3.2.4 anneal Benchmark
anneal solves the travelling salesman problem: given a number of cities that the salesman wants to visit, find the shortest route that visits each city exactly once. The problem is solved using simulated annealing techniques.
3.2.5 Benchmark Summary
For the system control benchmarks described above, the TMS320C31 performs at the same level as higher-priced devices and overall outperforms devices at the same price
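The matmult description above (7x7 operands stored as subsets of 8x8 arrays) can be sketched in C as follows. The benchmark's actual source is not reproduced in this brief, so the function name matmult7, the use of float, and the loop order are all assumptions made for illustration.

```c
#define DIM  8   /* storage dimension of each array          */
#define USED 7   /* rows and columns actually multiplied     */

/* Multiply the upper-left 7x7 blocks of a and b into c.
 * Row 7 and column 7 of c are left untouched.
 * Hypothetical sketch, not the benchmark's own source. */
void matmult7(float a[DIM][DIM], float b[DIM][DIM], float c[DIM][DIM])
{
    for (int i = 0; i < USED; ++i) {
        for (int j = 0; j < USED; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < USED; ++k)
                acc += a[i][k] * b[k][j];   /* one multiply and one add */
            c[i][j] = acc;
        }
    }
}
```

The inner statement is a multiply followed by an accumulate, which is the pattern a parallel multiply/add instruction is designed to execute in one cycle.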
31. bit I/O
- Extensive internal busing and parallelism for extremely fast data movement capability
- 8K bytes of single-cycle dual-access internal RAM: supports two accesses per machine cycle; can act as program memory, data memory, cache to external memory, or register file extensions
- Memory interface optimized for single-cycle SRAM accesses and static-column-decode DRAMs for high-speed external memory access while maintaining low system cost
- Boot loader to load/execute programs from other processors or inexpensive EPROMs
- On-chip emulation for true nonintrusive visibility and control during debug
- 132-pin plastic quad flat pack (PQFP) package
- Low price
The TMS320C31 is described in detail in Chapter 2.
Figure 1-1. TMS320C31 Performance (50-ns cycle time)
CPU and DMA performance
  CPU: 8 OPS/cycle = 160 MOPS
    2 data accesses                   40 MOPS
    1 FP multiply                     20 MOPS
    1 FP ALU operation                20 MOPS
    2 address register modifications  40 MOPS
    1 loop counter update             20 MOPS
    1 branch                          20 MOPS
  DMA coprocessor: 3 OPS/cycle = 60 MOPS
    1 data access                     20 MOPS
    1 address register modification   20 MOPS
    1 transfer counter update         20 MOPS
  Total: 220 MOPS
Data throughput
  Primary bus: 80M bytes/sec
  Serial port: 2M bytes/sec
  Total I/O: 82M bytes/sec
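The totals in Figure 1-1 follow directly from the 50-ns cycle time: 1000 / 50 gives 20 million cycles per second, and each operation counted per cycle contributes 20 MOPS. The sketch below reproduces that arithmetic. The function names are hypothetical, and the 4-bytes-per-cycle figure for the primary bus is an assumption (one 32-bit word per cycle) that happens to match the 80M bytes/sec shown in the figure.

```c
/* Millions of operations per second for a given number of operations
 * per cycle and a cycle time in nanoseconds (integer arithmetic,
 * fractions are truncated). */
unsigned mops(unsigned ops_per_cycle, unsigned cycle_ns)
{
    return ops_per_cycle * 1000u / cycle_ns;
}

/* Millions of bytes per second for a bus that moves bytes_per_cycle
 * bytes on every cycle. Assumes one transfer per cycle. */
unsigned mbytes_per_sec(unsigned bytes_per_cycle, unsigned cycle_ns)
{
    return bytes_per_cycle * 1000u / cycle_ns;
}

/* Combined CPU + DMA figure, using the per-cycle counts from Figure 1-1
 * (8 CPU operations and 3 DMA operations per cycle). */
unsigned total_mops(unsigned cycle_ns)
{
    return mops(8u, cycle_ns) + mops(3u, cycle_ns);
}
```

At 50 ns this gives 160 + 60 = 220 MOPS, matching the figure; the same functions can be evaluated at 60 ns for the slower device mentioned in the feature list.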
32. complete implementations of system capabilities such as FAX, modem, speech recognition, and image compression
Integrated Host Applications: To facilitate integration of host application programs with realtime DSP software, Spectron provides host computer software that transparently controls and communicates with SPOX tasks executing realtime algorithms on attached DSP hardware.
6.11 Spectrum Signal Processing Inc.
250 H Street, P.O. Box 8110 25, Blaine, WA 98230; 800 663 8986; FAX 604 438 3046
DSP PC Single-Board Computer
Designed for applications such as multimedia, the DSP PC single-board computer integrates PC technology with DSP technology on a full-size IBM AT plug-in card. A 25-MHz 80386 provides a 100% PC/AT-compatible platform for running DOS programs such as Microsoft Windows, Lotus 1-2-3, and Hypersignal Workstation, while a TMS320C31 provides up to 33 MFLOPS of DSP power. Features include:
- 2 megabytes of system DRAM, expandable to 8 megabytes
- High-performance SCSI interface with 32-bit bus-mastering DMA controller
- Dual floppy disk controller
- Two serial RS-232 ports
- Parallel printer port
- Realtime clock/calendar
- Keyboard and speaker ports
- TMS320C31 32-bit floating-point DSP
- Media Link high-speed bus expansion connector
DSP Link Peripherals
Spectrum's DSP Link peripherals are compatible with the DSP Link system expansion interface and can be connect
33. control development costs and solve design challenges. Further information on courses and schedules in North America can be obtained by contacting the TTO Central Registration office at 800 336 5236, ext. 3904.
5.9.6.1 TMS320C3x Design Workshop
The TMS320C3x DSP design workshop introduces design engineers to the powerful TMS320C3x generation of DSPs. Hands-on, EVM-based exercises throughout the course give the designer a rapid start in utilizing TMS320C3x design skills. Experience with digital design techniques is desirable. Assembly language experience is required. C language programming experience is desirable. Topics covered in the TMS320C3x DSP design workshop include:
- TMS320C3x architecture/instruction set
- Use of the PC-based TMS320C3x EVM
- Floating-point and parallel operations
- Use of the TMS320C3x assembler/linker
- C programming environment
- System architecture considerations
- Memory and I/O interfacing
- TMS320C3x development support
5.9.6.2 Digital Control Design Workshop
The digital control design workshop covers all the fundamental issues involved in the design and implementation of physical control systems using TMS320 DSPs. The workshop is divided into two major parts. The first part covers theory and design of control systems and discusses practical aspects that a control design engineer should be aware of before attempting to implement a controller. The second part is devoted to h
34. demand a list of the last 256 scheduled events, permitting you to trace the immediate history of the application. The second utility, a built-in workload monitor, acts to measure and to redistribute the workload at runtime.
RTXC and RTXC/MP address two important problems. First, the use of ANSI-standard C protects you from technology changes, thus preserving the software development investment. The easy upgrade path from a single-processor version of RTXC to the multiple-processor version of RTXC/MP ensures that the software investment is future-proof. Second, the difficulties of parallel or distributed programming become less problematic through RTXC/MP's use of a virtual single-processor model. The implementation is geared towards maximum performance so that hard realtime constraints are still satisfied, even in a multiple-processor system architecture.
A. T. Barrett & Associates Inc.
RTXC Specifics
With an implementation history dating from 1978, RTXC provides a sound foundation for the solution of complex realtime systems. It is based on the concept of preemptive multitasking that permits a system to make efficient use of both time and system resources. RTXC is distributed in three source code configurations defined by the set of kernel services embodied in each. The different configurations are available to meet the real needs of the embedded systems marketplace, where there is a wide diversity of functional capabilities required in a
35. devices a simple matter.
Debug of DSP code is supported by LSI's command-line MON31 and Windows 3.0-compatible View31. Both provide a comprehensive range of debug features. View31 allows multiple board debug sessions, and the windows display is configurable to meet the needs of the debug session. Several memory areas can be viewed simultaneously, while multiple register windows let you view just the registers of interest.
The LSI high-level language interface library allows the integration of the DSP functionality into the host PC. Functions are provided to control and pass data to and from the board, and the libraries are provided in both Microsoft and Turbo C formats.
6.9 Precise Software Technologies Inc.
301 Moodie Drive, Suite 308, Nepean, Ontario, Canada K2H 9C4; 613 596 2251; 613 596 6713
Precise MPX Realtime Multiprocessor Executive
Realtime embedded control applications are increasingly being solved by using DSPs instead of CISC-based 16- and 32-bit processors. The benefits of using DSPs are increased performance, simpler designs, and cost-effective multiprocessor applications. The TMS320C3x devices are cost-effective for many embedded applications, such as voice or data communications controllers, LAN controllers, peripheral controllers, laser printers, and biomedical devices. Applications that require additional processors to handle high throughput, high interrupt
36. functions that fully meets the specification of the SIGAda Numerics Working Group and the Ada-Europe Numerics Working Group
- AdaScope, the Tartan Ada source-level symbolic debugger
- Tartan Tool Set, consisting of the Tartan Ada linker, object file librarian, file conversions, and other utilities
- Online help files for the compiler and library interfaces and AdaScope commands
The Ada compiler produces fast, compact code through Ada-specific optimizations, optimizations that take advantage of C30 architecture features, and a full range of classical optimizations. Five optimization levels permit proper optimization strategy at each point in the development cycle.
Code size is further reduced by Tartan's compact, modular runtimes that include only the runtime functionality needed by the application in the executable image. The Tartan linker reduces code size still further by eliminating unused program sections from the executable image.
C30-Specific Features
- Access to many C30 native instructions
- Circular addressing
- Bit-reversed addressing
- C30 delayed branch functionality
- Repeat block and repeat single instructions
Compiler switches permit generation of 16-bit PC-relative conditional call instructions, control of interrupt latency time using the RPTS instruction, and specification of the number of wait states for the memory in which the program code is executed.
Tartan Inc. Ada
37. group of benchmarks referred to as the Intel Intro Benchmarks. Even though these benchmarks do not necessarily reflect controller performance for many realtime applications, the results are presented here to illustrate that high system control performance can be achieved with the TMS320C31 using high-level language code. Intel Intro Benchmarks results for embedded processors at the same price level as the TMS320C31 are also shown in Table 3-3 to show that the TMS320C31 is a low-cost, high-performance solution relative to other embedded controllers.
Table 3-3. Benchmark Comparison of the TMS320C31 With Embedded Controllers at the Same Price Level
Benchmark Units C3x 1 AMD29000 i960KA 3 68030 3 60 ns 60 ns 2 40 ns 30 ns YARC Board Dhrystones sec 82 237 24 388 23 423 9 049 2 17 192 91 942 45 378 113 062 15 120 14 86 12 67 22 552
Notes:
1. The C31 benchmarks were run on the Texas Instruments C3x application board using zero-wait-state SRAM. The C code was compiled using the TMS320 Floating-Point DSP Optimizing C Compiler. The benchmarks yield the same results for both the C30 and C31.
2. AMD29000 results are taken from an AMD application note, "Intel i960CA Benchmark Report Critique," by Tim Olson.
3. The i960KA and 68030 numbers are from the February 1990 issue of Electronic Engineering.
4. An asterisk denotes compiler in-lining of
38. in High-Z
Timer Signals (2 Pins)
TCLK0: Timer clock 0. As an input, TCLK0 is used by timer 0 to count external pulses. As an output pin, TCLK0 outputs pulses generated by timer 0.
TCLK1: Timer clock 1. As an input, TCLK1 is used by timer 1 to count external pulses. As an output pin, TCLK1 outputs pulses generated by timer 1.
Supply and Oscillator Signals (49 Pins)
H1 (1 pin, O/Z): External H1 clock. This clock has a period equal to twice CLKIN.
H3 (1 pin, O/Z): External H3 clock. This clock has a period equal to twice CLKIN.
VDD (20 pins): 5-VDC supply pins. All pins must be connected to a common supply plane.
VSS (25 pins): Ground pins. All ground pins must be connected to a common ground plane.
X1 (1 pin, O/Z): Output pin from the internal crystal oscillator. If a crystal is not used, this pin should be left unconnected.
X2/CLKIN (1 pin, I): The internal oscillator input pin from a crystal or a clock.
Reserved (4 Pins)
EMU2-EMU0 (3 pins, I): Reserved. Use 20-kΩ pull-up resistors to 5 volts.
EMU3 (1 pin)
Legend: I = input, O = output, Z = high-impedance state; S = SHZ active, H = HOLD active, R = RESET active.
Recommended decoupling capacitor value is 0.1 µF.
Follow the connections specified for the reserved pins. 18- to 22-kΩ pull-up resistors are recommended. All 5-volt supply pins must be connected to a common supply plane, and all ground pins must be connected to a common ground plane.
Chapter 3 TMS320C31 Features/Performance Comparison
This chapte
39. in different formats. Results are displayed as symbolic, hex, octal, and binary radices in a state window; as waveforms in a timing window; and as decoded mnemonics in a disassembly window. Display radices can be added or changed at any time without taking a new measurement.
Decoded instructions for the TMS320C3x processor are displayed in the disassembly window. The MAP hardware is capable of capturing all bus cycles. The C3x must be executing out of external RAM in order for the disassembler to operate effectively. Four disassembly display modes are available: display all bus cycles, delete non-executed cycles, delete data reads/writes, and display executed code only. These modes allow the display to be tailored to your needs. Hardware engineers will appreciate Display All Bus Cycles, while Display Executed Code Only will look much like the program listing to which a software engineer is accustomed, with symbolic labels for addresses.
Passive Interface
Biomation uses passive interfaces in microprocessor probe adapters. Passive interfaces bring the processor signals directly to the logic analyzer's high-impedance data probes. Direct connection to the CPU allows timing measurements to be made directly through the probe. Where loading is critical, clock signals have an active buffer on the probe board to ensure proper operation of the system under test.
Specifi
40. internal bus visibility, highly pipelined architectures, faster cycle times, higher density packaging that are inherent to sophisticated VLSI systems. Scan-based emulation eliminates special bond-out emulation devices, target cable/buffer signal degradation, and the mechanical and reliability problems associated with target connectors and surface-mount packaging. With scan-based emulation, your program can execute in realtime from internal or external target memory; no extra wait states are introduced by the emulator at any clock speed.
The TMS320C31's architecture implements scan-based emulation through internal shift-register scan chains accessed by a single serial interface. The scan chains provide access to internal device registers and state machines, allowing complete visibility and control. This nonintrusive approach even operates in a production environment where the DSP is soldered into a target system.
Since program execution takes place on the TMS320C31 in the target system, there are no timing differences during emulation. This new design offers significant advantages over traditional emulators. These advantages include:
- No cable length/transmission line problems
- Nonintrusive system
- No loading problems on signals
- No artificial memory limitations
- TMS320C3x C/assembly source debugger interface
- Easy installation
- In-system emulation
- No variance from device's data sheet specifica
41. level. For the matmult benchmark, the TMS320C31 offers superior results due to its single-cycle multiply support on chip.
These benchmarks focus on CPU performance and do not reflect that the TMS320C31 possesses more on-chip peripherals than the other processors shown. On-chip peripheral integration reduces system cost and complexity and is an important consideration in embedded controller selection.
Chapter 4 Application Examples
This chapter presents four application examples that show how the TMS320C30 and TMS320C31 have been used to integrate system control and signal processing functions in several application areas. In two of the examples, SPOX, a realtime embedded operating system from Spectron Microsystems, is used to facilitate the integration. For more information on SPOX, refer to Chapter 6. The examples discussed are as follows:
Topic
4.1 Telecommunications Example Using SPOX
4.2 Instrumentation Application and Processor Evaluation Example
4.3 Test Equipment Example Using SPOX
4.1 Telecommunications Example Using SPOX
4.1.1 Speech Recognition With TMS320C31 and SPOX
Voice Processing Corp. (VPC) of Cambridge, Massachusetts, a leader in speech recognition technology, develops and markets proprietary technology for speaker-independent continuous and discrete word recognition. VPC has taken a
42. of the operand.
Three-operand addressing modes
- Register: Same as for general addressing mode.
- Indirect: Same as for general addressing mode.
Parallel addressing modes
- Register: The operand is an extended-precision register.
- Indirect: Same as for general addressing mode.
Long-immediate addressing mode
- Long immediate: The operand is a 24-bit immediate value.
Conditional branch addressing modes
- Register: Same as for general addressing mode.
- PC-relative: A signed 16-bit displacement is added to the PC.
The various indirect addressing options available for the C31 are shown in Table 2-2. The table shows the options along with the value of the modification (mod) field, assembler syntax, operation, and function for each.
Table 2-2. Indirect Addressing
Indirect Addressing With Displacement
Mod field  Syntax          Operation                               Description
00000      *+ARn(disp)     addr = ARn + disp                       With predisplacement add
00001      *-ARn(disp)     addr = ARn - disp                       With predisplacement subtract
00010      *++ARn(disp)    addr = ARn + disp; ARn = ARn + disp     With predisplacement add and modify
00011      *--ARn(disp)    addr = ARn - disp; ARn = ARn - disp     With predisplacement subtract and modify
00100      *ARn++(disp)    addr = ARn; ARn = ARn + disp            With postdisplacement add and modify
00101      *ARn--(disp)    addr = ARn; ARn = ARn - disp            With postdisplacement subtract and modify
00110      *ARn++(disp)%   addr = ARn                              With postdisplacement add an
43. performance
  - 50-ns instruction cycle
  - 20 MIPS (million instructions per second)
  - 40 MFLOPS (million floating-point operations per second)
  - 220 MOPS (million operations per second; see Figure 1-1)
  - 80-Mbytes/second I/O bandwidth
  - 0.200-µs interrupt response
  - 60-ns and 74-ns devices also available
- Register-based pipelined CPU
  - Parallel multiply and arithmetic/logical operations on integer or floating-point numbers in a single cycle
  - Eight extended-precision registers
  - 24-bit address space
  - Two address generators with eight auxiliary registers, two index registers, and two auxiliary register arithmetic units
  - 32-bit barrel shifter
- Powerful instruction set
  - Single-cycle instruction execution
  - System control and numeric operations
  - Two- and three-operand instructions
  - Zero-overhead looping
  - Single-cycle branching
  - Conditional calls and returns
  - Flexible addressing modes, including circular addressing and auto-increment/decrement modes, allow high-speed data accesses
  - Single-cycle parallel math and memory operations
  - Interlocked instructions for multiprocessing support
- Integrated peripherals
  - DMA controller for concurrent I/O and CPU operation
  - Two-way set-associative instruction cache maximizes performance while minimizing system cost
  - Flexible serial port for 8/16/24/32-bit transfers, which can be configured for general-purpose bit I/O plus two 16-bit timers
  - Two 32-bit timers, which can also be configured for
44. realtime kernel. RTXC allows you to license the source code library that most closely fits your needs. If you need more capabilities later on, there is a simple upgrade path.
The three source code libraries (basic, advanced, and extended) are compatible with each other. All of the services in the basic library are included in the advanced library. All of the advanced library is part of the extended library. If you obtain a license to the basic library, you can upgrade to either the advanced or extended library without changing the application programs developed with the basic library.
RTXC/MP Specifics
The range of applications is vast, from single-processor embedded systems to complex control systems with various degrees of fault tolerance and using tens of processors. Throughout the spectrum of applications, RTXC/MP provides transparent distributed realtime processing without the need to change any line of application source code when changing attributes of system resources (for example, the location of tasks, queues, semaphores, and memory blocks, and the priority of tasks).
The transparency simply means that any cluster of processors can be regarded as a single realtime processing engine. While processors give you scalable computing power, RTXC/MP gives you scalable realtime software. Transparency is achieved by the implementation of a virtual single-processor model. The model uses a global naming scheme in which all system resources
45. software tools on the 68000-family-based SUN-3 series workstations and on the SUN-4 series machines that use the SPARC processor, but not on the SUN-386i series of workstations.
Table B-2. TMS320C3x Support Tool Part Numbers (Concluded)
Tool Description          Operating System   Part Number
Evaluation Module (EVM)   PC-DOS/MS-DOS      TMDS3260030
Note that SUN UNIX supports TMS320C3x software tools on the 68000-family-based SUN-3 series workstations and on the SUN-4 series machines that use the SPARC processor, but not on the SUN-386i series of workstations.
B.2 Device and Development Support Tool Prefix Designators
Prefixes to Texas Instruments part numbers designate phases in the product's development stage for both devices and support tools, as shown in the following definitions.
Device Development Evolutionary Flow
TMX: Experimental device that is not necessarily representative of the final device's electrical specifications
TMP: Final silicon die that conforms to the device's electrical specifications but has not completed quality and reliability verification
TMS: Fully qualified production device
Support Tool Development Evolutionary Flow
TMDX: Development support product that has not yet completed Texas Instruments internal qualification testing for development systems
TMDS: Fully qualified development suppo
46. subtract and modify; ARn = ARn - IR1
10110      *ARn++(IR1)%    addr = ARn; ARn = circ(ARn + IR1)       With postindex (IR1) add and circular modify
10111      *ARn--(IR1)%    addr = ARn; ARn = circ(ARn - IR1)       With postindex (IR1) subtract and circular modify
Indirect Addressing (Special Cases)
           *ARn++(IR0)B    addr = ARn; ARn = B(ARn + IR0)          With postindex (IR0) add and bit-reversed modify
LEGEND: addr = memory address; ARn = auxiliary register AR0-AR7; IRn = index register IR0 or IR1; disp = displacement; ++ = add and modify; -- = subtract and modify; circ() = address in circular addressing; % = where circular addressing is performed; B = where bit-reversed addressing is performed
2.2.6 Instruction Set Summary
The C31 offers instructions for both embedded control and numeric support. The following tables show each instruction's mnemonic, description, and operation. Table 2-3 shows the system control instructions. Table 2-4 lists the program flow control instructions. Table 2-5 shows the logical and bit manipulation instructions. Table 2-6 lists the load and store instructions. Table 2-7 shows the arithmetic instructions, and Table 2-8 summarizes the TMS320C31 parallel instructions, which execute in a single cycle.
Table 2-3. System Control Instruction Summary
Mnemonic   Description             Operation
IACK       Interrupt acknowledge   Dummy read of src; IACK toggled low, then
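The circular and bit-reversed modify operations in the table above are named but not spelled out. The sketch below models them on plain array indices, assuming the usual definitions: circular modify wraps an index within a block, and bit-reversed modify is an addition whose carry propagates from the most significant bit toward the least. The function names are hypothetical, and the model deliberately ignores details of the real hardware, such as block alignment, that this excerpt does not describe.

```c
#include <stdint.h>

/* Reverse the low `bits` bits of x (bits must be less than 32). */
uint32_t bit_reverse(uint32_t x, unsigned bits)
{
    uint32_t r = 0;
    for (unsigned i = 0; i < bits; ++i) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

/* Bit-reversed (reverse-carry) add on a `bits`-wide index:
 * reverse both operands, add normally, reverse the result back. */
uint32_t bitrev_add(uint32_t index, uint32_t step, unsigned bits)
{
    uint32_t mask = (1u << bits) - 1u;
    uint32_t sum = (bit_reverse(index, bits) + bit_reverse(step, bits)) & mask;
    return bit_reverse(sum, bits);
}

/* Simplified circular modify: advance an index and wrap it inside a
 * block of block_size elements (block_size must be nonzero). */
uint32_t circ_add(uint32_t index, uint32_t step, uint32_t block_size)
{
    return (index + step) % block_size;
}
```

Starting at index 0 and repeatedly calling bitrev_add with a step of half the block size (4 for an 8-point block, 3 bits) visits the indices in the order 0, 4, 2, 6, 1, 5, 3, 7, which is the access pattern an FFT needs for its reordered data.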
47. the multiplier performs integer multiplication, the input data is 24 bits and yields a 32-bit result.
2.2.4 Arithmetic Logic Unit (ALU)
The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and 40-bit floating-point data, including single-cycle integer and floating-point conversions. Results of the ALU are always maintained in 32-bit integer or 40-bit floating-point formats. The barrel shifter is used to shift up to 32 bits left or right in a single cycle.
Internal buses CPU1/CPU2 and REG1/REG2 carry two operands from memory and two operands from the register file, thus allowing parallel multiplies and adds/subtracts on four integer or floating-point operands in a single cycle.
2.2.5 CPU Memory Addressing Modes
The TMS320C31 supports a base set of general-purpose instructions as well as arithmetic-intensive instructions that are particularly suited for digital signal processing and other numeric-intensive applications.
For use with the general-purpose and arithmetic instructions, five groups of addressing modes are provided on the TMS320C31. Six types of addressing may be used within the groups, as shown in the following list:
General addressing modes
- Register: The operand is a CPU register.
- Short immediate: The operand is a 16-bit immediate value.
- Direct: The operand is the contents of a 24-bit address.
- Indirect: An auxiliary register indicates the address
48. the phone service providers can now quickly incorporate new voice technology on either VPC's hardware or their own hardware to suit different applications.
4.2 Instrumentation Application and Processor Evaluation Example
4.2.1 Background and System Description
Nicolet Instruments developed the first digital oscilloscope 20 years ago. They have since developed and marketed a variety of other data acquisition products based on the concept of digitizing analog waveforms. Although they design 8-bit digitizers that collect data at rates of up to 200 million samples/second, their product strength is in the higher-precision, lower-speed digitizers (10 to 16 bits wide, 1-50 million samples/second) with very long memories (greater than 1 million samples). Nicolet's requirements for an embedded processor were low system cost and high data movement and numeric processing performance.
Figure 4-2 is a block diagram of a typical Nicolet high-precision data acquisition system using a dual-processor architecture. The master CPU controls the data acquisition subsystem, which includes the analog converters, digitizer memory, and arbitration logic. In the current implementation of this architecture, a CISC processor is used as the master CPU. The slave processor handles high-speed data transfers within, in, and out of the system and performs numeric operations on the digitized data
49. time with 275/225 MOPS and 320/256 Mbytes/sec, respectively. There are six communication ports for direct interprocessor communication or communication between the processor and I/O peripherals. A self-programmable six-channel DMA coprocessor maximizes sustained CPU performance. The 512-byte instruction cache and two independent 32-bit memory interfaces support shared-memory configurations. The C4x 40-MHz version is designed for slower-speed DSP applications that would benefit from the attributes of a lower-priced floating-point TMS320C40 processor.

A.8 TMS320C5x

The TMS320C5x DSPs are the industry's highest-performance fixed-point DSPs. Designed to execute an instruction in 35 ns, the C5x is software upwardly compatible with all C1x and C2x DSPs, providing a fast performance upgrade path. Fast cycle times, large on-chip memories, a parallel logic unit (PLU), zero-overhead context switching, and block repeats differentiate the TMS320C5x. The C5x has 2 serial ports, which can operate in normal or time-division multiplexed (TDM) modes. The integration of the JTAG IEEE test bus standard increases system reliability, allowing 99% fault-grade testing and on-chip emulation. Spin-off devices can be developed rapidly because of the modular design of the C5x.

Appendix B Part Ordering Information

This chapter provides the device and support tool part numbers. Table B-1 lists the part numbers for the TMS320C30 and TMS320C3
51. 00 Logic Analyzers provide measurement capability for examining high-speed CISC, RISC, ASIC, and general logic design, including:
- 96-channel module with 50/100/200-MHz capture
- Measurement widths of up to 384 channels
- Configurations with 1 to 4 logic analyzers per CLAS 4000
- Configurations with 1 or 2 logic analyzers per CLAS 2000
- Full-speed triggering with multilevel trace control
- Time-stamped transitional recording
- Disassembling of all DSP instructions
- Full-speed operation for clock and data rates
- Monitoring of every C3x signal with a single probe connection
- Small interface probes for dense boards
- Reliable high-speed probing
- Timing and state measurements made through the processor probe
- Full symbolic display and triggering for address, data, and control groups
- Support for multiprocessor systems

Operation

Operation is quick and simple. To connect to your target, just install the probe board between the C3x CPU and its socket. Click on the icon representing the C3x disassembler setup, and the entire logic analyzer will be configured automatically. The setup assigns channels to all of the CPU's signals, arranges the channels into address, data, and status groups, and sets up the clocking for the C3x. Predefined trigger patterns are also provided so that you can quickly specify which samples are captured.

Display

Data captured on the CLAS can be viewed simultaneously in several windows, with each window displaying the data
52. 1

FEDERAL REPUBLIC OF GERMANY: Texas Instruments Deutschland GmbH, Haggertystrasse 1, 8050 Freising, FR Germany. Tel: 49 8161 80 0
FRANCE (Paris): Texas Instruments France, 8-10 Avenue Morane Saulnier, Boîte Postale 67, Velizy-Villacoublay Cedex, France. Tel: 33 13 0701001
HONG KONG: Texas Instruments Hong Kong Ltd., 8th Floor, World Shipping Centre, 7 Canton Road, Kowloon, Hong Kong. Tel: 852 7351223
ITALY (Milan): Texas Instruments Italia S.p.A., Centro Direzionale Colleoni, Palazzo Perseo, Via Paracelso North 12, 20041 Agrate Brianza (MI), Italy. Tel: 39 39 63221
JAPAN (Osaka): Texas Instruments Asia Ltd., Osaka Branch, Nissho Iwai Bldg. 5F, 2-5-8 Imabashi, Chuou-Ku, Osaka, Japan 541. Tel: 81 6 204 1881
KOREA: Texas Instruments Korea Ltd., 28th Floor, Trade Tower, 159 Samsung-Dong, Kangnam-Ku, Seoul; Trade Center P.O. Box 45, Seoul, Korea 135-729. Tel: 82 2 5512800
SINGAPORE: Texas Instruments Singapore (Pte) Ltd., Asia Pacific Division, 101 Thomson Road, #23-01 United Square, Singapore 1130. Tel: 65 2519818
SWEDEN: Texas Instruments International Trade Corporation, Box 30, S-164 93 Kista, Isafjordsgatan 7, Sweden. Tel: 8 752 5800
TAIWAN: Texas Instruments Taiwan Ltd., Taipei Branch, 10 Floor, Bank Tower, 205 Tung Hua N. Road, Taipei, Taiwan 105, Republic of China. Tel: 886 2 7139311
UNITED KINGDOM: Texas Instruments Ltd., Regional Technology Center, Manton Lane, Bedford, Engl
53. 1: register addr (R7–R0)
src3: register addr (R7–R0)
dst1: register addr (R7–R0)
op3: register addr (R0 or R1)

Table 2-8. Parallel Instruction Set Summary (Concluded)

The concluding rows of the table cover three forms of parallel instruction:
- An operation on src1 and src2 whose result goes to dst1, in parallel with src3 → dst2 (for example, src1 OR src2 → dst1 and src1 XOR src2 → dst1).
- Parallel load instructions (including load floating point): src2 → dst1 in parallel with src4 → dst2.
- Parallel multiply and add/subtract instructions: op1 × op2 → op3 in parallel with an add or subtract of op4 and op5 → op6.

Legend:
src1: register addr (R7–R0)
src3: register addr (R7–R0)
src2: indirect addr (disp = 0, 1, IR0, IR1)
src4: indirect addr (disp = 0, 1, IR0, IR1)
dst2: indirect addr (disp = 0, 1, IR0, IR1)
op6: register addr (R2 or R3)
op1, op2, op4, op5: two of these operands must be specified using register addr, and two must be specified using indirect.

2.3 Memory Organization

The total memory space of the TMS320C31 is 16 megawords (32 bits each). Program, data, and I/O space are contained within this 16-megaword address space, allowing tables, program code, or data to be stored in eit
54. 1, and Table B-2 gives ordering information for TMS320C3x hardware and software support tools. An explanation of the TMS320 family device and development support tool prefix and suffix designators follows the two tables to assist you in understanding the TMS320 product numbering system. The topics covered include:

B.1 Part Numbers
B.2 Device and Development Support Tool Prefix Designators
B.3 Device Suffixes

B.1 Part Numbers

Table B-1. TMS320C3x Digital Signal Processor Part Numbers
Columns: device, technology, frequency, package type, power dissipation.
TMS320C31PQL40: 0.8-µm CMOS, 40 MHz, plastic 132-pin QFP
SMJ320C30HUM28, SMJ320C30HTM28: 1.0-µm CMOS, 28 MHz, ceramic packages (181-pin PGA and 196-pin entries)
SMJ320C30HUM25, SMJ320C30HTM25: 1.0-µm CMOS, 25 MHz, ceramic packages (181-pin PGA entry)
TMS320C31PQA: 0.8-µm CMOS, 33 MHz, plastic 132-pin QFP, 1.00 W

Table B-2. TMS320C3x Support Tool Part Numbers
Columns: tool description, operating system, part number.

Software
C Compiler & Macro Assembler/Linker:
  VAX/VMS: TMDS3243255-08
  PC-DOS/MS-DOS: TMDS3243855-02
  SUN UNIX†: TMDS3243555-08
  MAC MPW: TMDS3243565-01
Macro Assembler/Linker:
  PC-DOS/MS-DOS, OS/2: TMDS3243850-02
Simulator:
  VAX/VMS: TMDS3243251-08
  PC-DOS/MS-DOS: TMDS3243851-02
  SUN UNIX†: TMDS3243551-09
SPOX OS Software for C3x Target Board:
  PC-DOS/MS-DOS: TMDS3240132

† Note that SUN UNIX supports TMS320C3x
56. 3 There is a 90-minute access limit per day on the bulletin board. The BBS is open 24 hours a day. ROM-code algorithms may be submitted by secure electronic transfer via the TMS320 BBS.

5.9.4 TMS320 DSP Technical Hotline

The TMS320 group at Texas Instruments maintains a DSP Hotline to answer TMS320 technical questions. Specific questions regarding TMS320 device problems, development tools, third-party support, consultants, documentation, upgrades, and new products are answered.

The TMS320 DSP Technical Hotline is open five days a week from 8:00 AM to 6:00 PM Central Time. It is staffed with engineers ready to provide the support needed for your TMS320 design or evaluation.

To assure the maximum support from this service, first consult your product documentation. If your question is not answered there, gather all of the information that applies to your problem. With your information, manuals, and products close at hand, call:

TMS320 DSP Technical Hotline: (713) 274-2320

For realtime transmission of information, a facsimile machine is available (FAX: (713) 274-2324), or you may submit information via electronic mail. The Hotline Internet address is 4389750@mcimail.com. The MCI mail address is 4389750 or TMS320 Hotline.

Questions on pricing, delivery, and availability should be directed to the nearest TI Field Sales Office.

5.9.5 TMS320 Application Software

To simplify development o
57. Memory maps (figure continued)

(a) Microprocessor mode
Interrupt locations and reserved (192 words); external, STRB active
External, STRB active (8M minus 192 words)
Reserved (32K words)
Peripheral bus memory-mapped registers (6K words, internal)
809800h–809BFFh: RAM block 0 (1K words, internal)
809C00h–809FFFh: RAM block 1 (1K words, internal)
80A000h–FFFFFFh: External, STRB active (8M minus 40K words)

(b) Microcomputer/boot loader mode
0h–FFFh: Reserved for boot loader operations (4K words)
1000h–7FFFFFh: External, STRB active (400000h is also marked within this range)
800000h–807FFFh: Reserved (32K words)
808000h–8097FFh: Peripheral bus memory-mapped registers (6K words, internal)
809800h–809BFFh: RAM block 0 (1K words, internal)
809C00h–809FC0h: RAM block 1 (1K minus 64 words, internal)
Up to 809FFFh: User program interrupt and trap branches (64 words, internal)
80A000h–FFFFFFh: External, STRB active (8M minus 40K words; FFF000h is also marked within this range)

2.4 Internal Bus Operation

A large portion of the TMS320C31's high performance is due to internal busing and parallelism. The separate program buses (PADDR and PDATA), data buses (DADDR1, DADDR2, and DDATA), and DMA buses (DMAADDR and DMADATA) allow for parallel program fetches, data accesses, and DMA accesses. These buses connect all of the physical spaces (on-chip memory, off-chip memory, and on-chip peripherals)
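The microprocessor-mode memory map above can be expressed as a small address classifier. The following C sketch is illustrative only: the enum and function names are hypothetical, and the lower boundaries (0C0h, 800000h, 808000h) are not printed in the surviving map (a) but are derived here from the stated region sizes and from the addresses shown in map (b).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical region names for the microprocessor-mode map. */
typedef enum {
    REGION_INTERRUPT_VECTORS, /* interrupt locations and reserved, 192 words */
    REGION_EXTERNAL_LOW,      /* external, STRB active, 8M minus 192 words */
    REGION_RESERVED,          /* reserved, 32K words */
    REGION_PERIPHERAL,        /* peripheral bus registers, 6K words */
    REGION_RAM_BLOCK0,        /* internal RAM block 0, 1K words */
    REGION_RAM_BLOCK1,        /* internal RAM block 1, 1K words */
    REGION_EXTERNAL_HIGH      /* external, STRB active, 8M minus 40K words */
} c31_region;

/* Classify a 24-bit word address. Boundaries below 809800h are derived
 * from region sizes (192 = 0xC0 words, 8M = 0x800000 words), which is an
 * assumption of this sketch rather than a value read from map (a). */
c31_region c31_classify(uint32_t addr)
{
    assert(addr <= 0xFFFFFFu);  /* the address space is 16 megawords */
    if (addr < 0x0000C0u) return REGION_INTERRUPT_VECTORS;
    if (addr < 0x800000u) return REGION_EXTERNAL_LOW;
    if (addr < 0x808000u) return REGION_RESERVED;
    if (addr < 0x809800u) return REGION_PERIPHERAL;
    if (addr < 0x809C00u) return REGION_RAM_BLOCK0;
    if (addr < 0x80A000u) return REGION_RAM_BLOCK1;
    return REGION_EXTERNAL_HIGH;
}
```

An ordered chain of upper-bound comparisons keeps each region boundary in exactly one place, so the table is easy to check against the sizes listed in the map.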
58. 8 kg (1.75 lb) with cables and probe adapter. Temperature: 0–50°C, noncondensing.

6.4 Byte-BOS

Byte-BOS, P.O. Box 3067, Del Mar, CA 92014, (800) 788-7288 or (619) 755-8836

Byte-BOS Multitasking Operating System

Byte-BOS Multitasking Operating System (BOS) is a low-cost, full-featured, realtime, preemptive multitasking operating system and is available for TMS320 DSPs. Byte-BOS brings the cost of multitasking within reach of all embedded software applications by providing a common code base across a wide range of processors, including the TMS320C3x DSPs. BOS consists of a C library of realtime multitasking functions with the following features:
- Preemptive and nonpreemptive prioritized task scheduling
- Task control and management
- Timer management
- Event synchronization
- Message passing
- Resource management
- Serial I/O management
- Interrupt stack and nested interrupt handling
- Low-power management
- Function timeout, blocking, and nonblocking return
- TMS320 on-chip timer and serial port integration
- Application code for TMS320 embedded platform
- External UART serial I/O management add-on library
- Fixed-block memory management add-on library
- Multiple programmable event timers add-on library
- Multiple message buffers add-on library
- BOSVIEW realtime operating system view port add-on library
- Library and applications code, compiler batch and make files
- Comprehensive reference ma
59. AM, 256 words of on-chip ROM, and up to 128K words of data/program RAM.

A.6 TMS320C3x

The TMS320C3x DSPs incorporate floating-point arithmetic and offer the features of a supercomputer on a single chip, executing more than 33 MFLOPS. High performance is gained through large on-chip memories (2K words of RAM and 4K words of ROM), a concurrent DMA controller, and instruction cache (64 words). Two serial ports, two timers, a DMA controller, and large on-chip system memory are achieved by using a high-density CMOS process incorporating 700,000 transistors. This high level of on-chip integration reduces system cost, space, and power requirements. Because the C3x devices are floating-point DSPs, numbers no longer need to be scaled, thereby simplifying code development. Future C3x devices will support applications needing faster cycle times, lower cost, and extreme temperature and reliability characterization. TMS320C3x development is supported by high-level language compilers (C and Ada) and the SPOX realtime operating system. Scan-based emulation is possible through a unique on-chip serial scan path, which provides access to all chip registers.

A.7 TMS320C4x

The TMS320C4x DSPs are the world's first floating-point DSPs designed for parallel processing. The C4x devices include 40- and 50-MHz versions. The C4x CPU features a 40/50-ns single-cycle floating-point instruction execution
60. Tables

CPU Registers
Indirect Addressing
System Control Instruction Summary
Program Flow Control Instruction Summary
Logical and Bit Manipulation Instruction Summary
Load and Store Instruction Summary
Arithmetic Instruction Set Summary
Parallel Instruction Set Summary
TMS320C31 Signal Descriptions
Description of the Fields in Table 3-2
Feature/Performance Comparison of Embedded Controllers
Benchmark Comparison of the TMS320C31 With Embedded Controllers at the Same Price Level
RTC Worldwide Locations
TMS320 Family Overview
TMS320 Family Features and Benefits
TMS320C3x Digital Signal Processor Part Numbers
TMS320C3x Support Tool Part Numbers

Chapter 1 Introduction
61. DSP with the on-chip peripherals of a microcontroller. Operating at 25.6 MHz, the TMS320C14 offers five to ten times the speed of traditional 16-bit microcontrollers and can execute advanced control algorithms such as Kalman filters and state controllers for analog-type performance. On-chip peripherals such as event manager with PWM, bit I/O, watchdog timer, serial port, and baud-rate generator reduce chip count, resulting in space and cost savings. With 4K words of on-chip EPROM, the TMS320E15, E17, and E14 support realtime code development.

A.5 TMS320C2x

The TMS320C2x DSPs offer from two to four times the performance of the C1x devices. Since the TMS320C2x devices are source-code compatible with TMS320C1x DSPs, they provide an ideal upgrade path for the world's largest installed base of signal processors. The TMS320C2x DSPs offer instruction cycle times as fast as 80 ns, two to four times the amount of on-chip RAM, larger external memory reach (160K), multiprocessor capabilities, and several additional application-specific instructions and addressing modes. The E25 offers 4K words of on-chip EPROM for realtime code development and prototyping ease. The TMS320C2x ROM versions can be used for system cost reduction. The C2x DSPs vary in instruction time and memory size and type. Specifically, the TMS320C25-50 supports 50-MHz (80-ns) operation. The TMS320C26 offers 1.5K words of on-chip data R
62. If Csrc is a register, Csrc → PC. If Csrc is a value, Csrc + PC + 1 → PC. Else, PC + 1 → PC.

BcondD: Branch conditionally (delayed). If cond = true: if Csrc is a register, Csrc → PC; if Csrc is a value, Csrc + PC + 3 → PC. Else, PC + 1 → PC.
BR: Branch unconditionally (standard). Value → PC.
BRD: Branch unconditionally (delayed). Value → PC.
CALL: Call subroutine. PC + 1 → TOS; Value → PC.
CALLcond: Call subroutine conditionally. If cond = true: PC + 1 → TOS; if Csrc is a register, Csrc → PC; if Csrc is a value, Csrc + PC + 1 → PC. Else, PC + 1 → PC.
DBcond: Decrement and branch conditionally (standard). ARn − 1 → ARn. If cond = true and ARn ≥ 0: if Csrc is a register, Csrc → PC; if Csrc is a value, Csrc + PC + 1 → PC. Else, PC + 1 → PC.
DBcondD: Decrement and branch conditionally (delayed). ARn − 1 → ARn. If cond = true and ARn ≥ 0: if Csrc is a register, Csrc → PC; if Csrc is a value, Csrc + PC + 3 → PC. Else, PC + 1 → PC.
RPTB: Repeat block of instructions. src → RE; 1 → ST(RM); Next PC → RS.
RPTS: Repeat single instruction. src → RC; 1 → ST(RM); Next PC → RS; Next PC → RE.

LEGEND:
src: general addressing modes
src1: three-operand addressing modes
src2: three-operand addressing modes
Csrc: conditional-branch addressing modes
Sreg: register address (any register)
Dreg: register address (any register)
Rn: register address (R7–R0)
Daddr: destination memory address
ARn: auxiliary register n (AR7–AR0)
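The conditional-branch rows above all follow one pattern for computing the next PC. A minimal sketch of that pattern in C is shown below; the function name `next_pc` and the `PC_MASK` constant are hypothetical, and masking the result to 24 bits is an assumption based on the 16-megaword address space rather than something the table states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PC_MASK 0xFFFFFFu  /* assumption: results are kept within 24 bits */

/* Hypothetical model of the Bcond / BcondD rows:
 *   cond false           -> PC + 1
 *   Csrc is a register   -> Csrc
 *   Csrc is a value      -> Csrc + PC + 1 (standard) or Csrc + PC + 3 (delayed)
 */
uint32_t next_pc(uint32_t pc, int32_t csrc, bool csrc_is_register,
                 bool delayed, bool cond)
{
    assert(pc <= PC_MASK);
    if (!cond) {
        return (pc + 1u) & PC_MASK;
    }
    if (csrc_is_register) {
        return (uint32_t)csrc & PC_MASK;
    }
    /* Unsigned wraparound makes negative displacements work as expected. */
    return (pc + (uint32_t)csrc + (delayed ? 3u : 1u)) & PC_MASK;
}
```

The same function covers the branch step of DBcond and DBcondD if the caller folds the ARn test into `cond`; the ARn decrement itself is not modeled here.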
63. Language Features:
- Representation specifications
- Unchecked deallocation and conversion
- Insertion of routines written in machine code

Available options include an interface to the Spectron SPOX DSP vector, matrix, and filter math functions; TI simulator; facilities for customizing the runtimes; and the AdaScope hardware interface.

Figure 6-6. AdaScope Debugger Screen

6.13 Tektronix

Tektronix, P.O. Box 500, Beaverton, OR 97077, (800) 835-9433, (503) 627-7111

Tektronix offers realtime symbolic debugging support for TMS320 development with their comprehensive line of logic analyzers, including the DAS9200 and PRISM 300. Tektronix logic analyzers provide powerful fault-triggering capabilities coupled with comprehensive mnemonic disassembly support, including performance, state, timing, and analog analysis for hardware, software, and integration applications. It is ideal for the testing and debugging of algorithms on TMS320 hardware. See Figure 6-7.

DAS9200
- Realtime symbolic debugging
- Support of up to 5000 symbols from your compiler/assembler with LA-LINK
- Four disassembly display modes
- 8K, 32K, 128K trace buffers
- Automatic fetch prediction
- 200-MHz state analysis
- 2-GHz timing analysis
- 100-MHz pattern generation
- Time correlation of up to ten DSPs
- Hard disk for storag
64. OX application libraries enable many standard C programs normally run on a host computer under UNIX or MS-DOS to be literally recompiled and executed faster on attached DSP hardware.

DSP Math Functions

SPOX furnishes over 100 standard math functions that can be used as building blocks for algorithms employed in advanced DSP applications, such as:
- Vector functions: arithmetic and logical operations, dot product, convolution, correlation, FFT, windowing, LPC analysis
- Matrix functions: arithmetic and logical operations, row and column manipulation, matrix multiplication, 2-D FFT
- Filter functions: FIR, IIR, and LMS adaptive filtering

The goal of the SPOX math library is to allow DSP application developers to write as much of their program in C as possible without sacrificing overall system performance. To accomplish this goal, all SPOX math functions are optimized in assembly language. Just as importantly, they are tightly integrated into the SPOX memory management and I/O system so that critical data operated on by the math algorithms is situated in the appropriate memory, and the overhead incurred in exchanging data between I/O streams and math algorithms is kept to a minimum.

Multiprocessing Systems

SPOX addresses the needs of multi-DSP applications with a set of functions that extend the multitasking, I/O, and memory management capabilities of SPOX OS from a uniprocessor to a multiprocessor a
65. RTC Locations

6 TMS320C31 Third-Party Support
6.1 Accelerated Technology Inc.
6.2 A.T. Barrett & Associates Inc.
6.3 Biomation
6.4 Byte-BOS
6.5 Computer Motion Inc.
6.6 Electronic Tools GmbH
6.7 Integrated Motion Incorporated
6.8 Loughborough Sound Images Ltd.
6.9 Precise Software Technologies Inc.
6.10 Spectron Microsystems Inc.
6.11 Spectrum Signal Processing Inc.
6.12 Tartan Inc.
6.13 Tektronix
6.14 Wintriss

A TMS320 DSP Family
A.1 The DSP Market
A.2 The TI Role in the DSP Industry
A.3 The TMS320 Product Roadmap
A.4 TMS320C1x
A.5 TMS320C2x
A.6 TMS320C3x
A.7 TMS320C4x
A.8 TMS320C5x

Part Ordering Information
66. SP system based on the Texas Instruments TMS320C31 and is not larger than the size of a credit card.

The module addresses two significant areas of DSP-based system design: it can either be used as a fully functional development system on which algorithms can be rapidly implemented and debugged, or as a module which is easily integrated into any user's end system. The module is particularly attractive for low- to medium-volume embedded solutions requiring a fast turnaround time, as it may be designed into any industrial product just like a large IC. This proven platform, manufactured in SMD technology, offers a number of standardized interfaces which allow full access to all of the DSP's features. Compatibility is guaranteed with other products of Electronic Tools' miniKit range.

Debugging is performed on a PC with the TI db30 source-level debugger, which is linked to miniKit 320C31 via a small PC controller board and the emulation port of the TMS320C31. A rich set of software utilities ensures that all steps, from algorithm implementation in C or assembler code right down to programming miniKit's boot EPROM, can be achieved on the fly.

- Credit-card-sized DSP system (85 mm x 61 mm)
- TMS320C31, 33 MHz
- 128K x 32 zero-wait-state static RAM
- 64K x 8 boot EPROM
- Booting possible via EPROM, host interface RAM, or serial interface
- Watchdog timer
- Power-failure detection
- Battery backup
- miniBus interface (standardized 16-bit parallel bus) for attaching
67. 1.4 TMS320C31 Development Support

The C31's general-purpose 32-bit architecture and TI's comprehensive set of development tools make designing systems with a C31 as easy as designing with a traditional controller. These tools include:
- ANSI-compatible optimizing C compiler
- Realtime operating system support
- The programmer's interface (a window-based C source/assembly debugger)
- Code profiler
- Software simulator
- Low-cost evaluation module (EVM)
- TMS320C3x XDS scan-based emulator
- C3x application board
- HP64700 analysis subsystem
- Extensive third-party support
- Hotline support
- Bulletin board support
- Thousands of pages of application notes and technical documentation

A complete description of TMS320C31 development support can be found in Chapter 5. Figure 1-3 illustrates the C31 development flow.

Figure 1-3. TMS320C3x Development Environment (blocks shown: C compiler, assembler source, macro library, assembler, library of object files, executable object file, object format converter, EPROM, XDS)

TMS320C31 performance benchmarks can be found in Chapter 3, and detailed system examples are shown in Chapter 4.

1.5 Benefits of a TMS320C31-Based Embedded System

The device price d
68. The TMS320C31 peripherals include two timers and one serial port. Figure 2-5 shows the peripherals with associated buses and signals.

Figure 2-5. Peripheral Modules (Serial Port 0: port control register, R/X timer register, data transmit register, data receive register; pins FSX0, DX0, CLKX0, FSR0, DR0, CLKR0. Timer 0 and Timer 1: global control register, timer period register, timer counter register; pins TCLK0 and TCLK1.)

2.5.1 Timers

The two timer modules are general-purpose 32-bit timer/event counters with two signaling modes and internal or external clocking. Each timer has an I/O pin that can be used as an input clock to the timer or as an output signal driven by the timer. The pin may also be configured as a general-purpose I/O pin.

2.5.2 Serial Port

The TMS320C31 offers a full-duplex synchronous serial port, which can be used as a general system interface or glueless logic connection to an external analog converter. The serial port can be configured to transfer 8, 16, 24, or 32 bits of data per word. The clock for each serial port can originate either internally or externally. An internally gene
69. HP 64776 Analysis Subsystem

Key features of the subsystem include:
- 64 analysis channels that can trace the TMS320C3x's primary or expansion bus as well as status information.
- Nonintrusive analysis that lets you view the processor's bus cycles in realtime. Analysis can be performed on the following signals:
  A0–A23 (primary bus address), XA0–XA12 (expansion bus address)
  D0–D31 (primary bus data), XD0–XD31 (expansion bus data)
  INT0–INT3, TCLK0, TCLK1, XF0, XF1
  STRB, MSTRB, IOSTRB, R/W, HOLDA, IACK
- Trace specifications that can be set up easily using address, data, and status event comparators. A range comparator can also be used to qualify addresses or data.
- Hardware breakpoint capabilities that enable you to detect a specified event and stop the processor. Once the processor is stopped, the debug capabilities of the TMS320C3x XDS facilitate isolation of the target's hardware/software problems.
- The ability to drive triggered signals to and receive them from other instruments, such as logic analyzers and oscilloscopes, allowing synchronized measurements between tools.

The HP 64776 operates on PC/AT platforms utilizing DOS version 3.0 or higher.

5.9 TMS320 Technical Support

5.9.1 Technical Documentation

A wide variety of technical litera
70. and MK41 7PA. Tel: 44 234 270111

Chapter 6 TMS320C31 Third-Party Support

This chapter lists third-party manufacturers and suppliers alphabetically by name and describes their current C31 products. The third parties discussed in this chapter include:

6.1 Accelerated Technology Inc.
6.2 A.T. Barrett & Associates Inc.
6.3 Biomation
6.4 Byte-BOS
6.5 Computer Motion Inc.
6.6 Electronic Tools GmbH
6.7 Integrated Motion Incorporated
6.8 Loughborough Sound Images Ltd.
6.9 Precise Software Technologies Inc.
6.10 Spectron Microsystems Inc.
6.11 Spectrum Signal Processing Inc.
6.12 Tartan Inc.
6.13 Tektronix
6.14 Wintriss

6.1 Accelerated Technology Inc.

Accelerated Technology Inc., P.O. Box 850245, Mobile, AL 36685, (800) 468-NUKE, (205) 661-5770

Nucleus RTX

Nucleus RTX is a multitasking executive specifically designed for realtime embedded applications using the TMS320C3x microprocessors. Nucleus provides applications with advanced realtime facilities th
71. ands-on experience with TMS320C25 DSPs to demonstrate and practice control implementation examples. A design and implementation software package is used to test algorithms on an actual motor positioning system. Topics covered in the digital control design workshop include:
- System modeling
- Stability analysis
- Analysis of numerical problems
- Quantization effects
- Truncation, rounding, and scaling issues
- Sampling rate selection
- Algorithm structural optimization

5.9.6.3 Applications in C Design Workshop

The Applications in C design workshop is an advanced C programming course which is tailored for practical, hands-on applications using Turbo C and the TI TMS320C3x C compiler. This course is for hardware and software engineers with a background in programming and an introductory knowledge of C. The course centers around data structure concepts illustrated with application examples. Program examples include file filters, sorting, Huffman coding for data compression, memory management, graphics algorithms, and other utilities. Topics covered in the Applications in C design workshop include:
- Review of C language syntax and conventions
- Data structures: constructs and concepts
- Optimization and efficiency techniques
- Arrays and pointers
- Portability issues
- Algorithms (FFT, discrete transforms, bit manipulation, etc.)

5.9.7 Design Services

The TI
72. are known system-wide. The use of the global naming scheme relies on the embedded router in RTXC/MP. The RTXC/MP router, which supports up to 64K processor nodes and 64K tasks, is attractive for pure communication applications. The routing tables, automatically generated by RTXCgen from the link connections table, allow you to write all communication between tasks as if they were located on the same processor. Under high communication loads, prioritized handling in the router avoids lower-priority messages blocking higher-priority messages.

While the single-processor kernel, RTXC, can be used with multiple processors if you define your own communication protocols, the distributed version frees you from this burden. Moreover, because RTXC/MP uses a message-based mechanism, ports to common-memory, local-memory, and LAN-based systems can easily be done. A distributed I/O library and graphics server is also available for RTXC/MP.

The design philosophy behind RTXC/MP has proven to be a major step forward to shield software applications from technology changes. It offers a future-proof environment for the transparent development of scalable realtime software on scalable processor hardware.

6.3 Biomation

Biomation, 19050 Pruneridge Ave., Cupertino, CA 95014, (800) 944-2466, FAX (408) 988-1647

CLAS 2000 and CLAS 4000 Logic Analyzers

The CLAS 2000 and CLAS 40
73. areas. You can disable portions of a profile area to prevent them from adding to the statistics. This is convenient for removing the timing impact of standard library functions or a fully optimized portion of code.
- Simplicity. The profiler's simple setup, default configurations, canned commands, and inherent flexibility facilitate sophisticated profiling within a short time.

5.3 TMS320C31 Assembly Language Tools

The TMS320C31 assembly language tools are code generation tools that convert assembly language source files into executable object code. Key features include:
- Macro capabilities and library functions
- Conditional assembly
- Relocatable modules
- Complete error diagnostics
- Symbol table and cross references

The assembler translates assembly language source files into machine language object files. Source files can contain instructions, assembler directives, and macro directives. Assembler directives control various aspects of the assembly process, such as the source listing format, symbol definition, and the way the source code is placed into sections. The assembler has the following features:
- Processes the source statements in a text file to produce a relocatable object file
- Produces a source listing, if requested, and provides control over this listing
- Appends a cross-reference listing to the source listing, if requested
- Allows segmentation of u
74. ariables, arrays, structures, pointers: any kind of data in their natural format (float, int, char, enum, or pointer). Entire linked lists can be displayed (see Figure 5-9). Patch assembler: You can modify code from the debugger command line without reassembling your assembly source. [Figure 5-9. Debugger's Data Display] Powerful command set: The TMS320 debugger supports a small but powerful command set that makes full use of C expressions. One debugger command performs actions that might require several commands in another system. Compatibility: The TMS320C31 C source debugger runs on IBM PC/ATs and compatible PCs. For the simulator, the debugger is available on Sun workstations. Profiler: The C source debugger has an option for profiling software. When you are deciding whether to convert portions of a program from C to assembly, it is helpful to know which functions take the most time. A profiler that measures the amount of execution time in different functions or portions of a program is very helpful. The profiler is easy to use and pro
76. ary bus is the external memory interface. The primary bus consists of a 24-bit address bus, a 32-bit data bus, and a set of control signals. It can be used to address external program/data memory or I/O space. The bus has an external ready signal (RDY), which can be used in conjunction with the on-chip software for controlled wait-state generation. See Table 2-9 for a description of the TMS320C31 external signals. 2.7.1 External Bus Control Features. The TMS320C31 external bus provides flexibility to implement different types of memory systems. The STRB control signal remains active between consecutive read cycles to the same bank of memory, allowing high-speed SRAM and static-column decode accesses. In addition, the primary bus has a programmable bank-switching feature, providing more time for address decoding and memory turn-off when a bank boundary is crossed. 2.7.2 Multiprocessor Support. The TMS320C31 supports shared-memory multiprocessor systems through its HOLD and hold-acknowledge (HOLDA) signals. When the HOLD input is asserted, the primary bus control, address, and data bus signals go into a high-impedance state after the current bus cycle is complete. The HOLDA output acknowledges that the C31 primary bus has gone into the high-impedance state. Interlocked operations ease the implementation of multiprocessor operations such as busy-wait loops, shared-counter manipulation, and semaphores. The TMS320C31 supports i
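The busy-wait loops and semaphores mentioned in the excerpt above depend on an indivisible read-modify-write of a shared word. The sketch below shows that idea in portable C11, with `atomic_flag` standing in for the C31's interlocked operations. This is an analogy, not C31 code: the type `shared_lock` and its functions are hypothetical names introduced here for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word; stands in for a flag held in shared memory. */
typedef struct {
    atomic_flag busy;
} shared_lock;

#define SHARED_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock. Test-and-set reads the old value and
 * writes "busy" as one indivisible step, so two processors can never
 * both see the lock as free. Returns true if the caller now owns it. */
static bool shared_lock_try(shared_lock *l)
{
    return !atomic_flag_test_and_set(&l->busy);
}

/* Busy-wait loop: spin until the lock is obtained. */
static void shared_lock_acquire(shared_lock *l)
{
    while (!shared_lock_try(l)) {
        /* spin; a real system might back off or yield here */
    }
}

/* Release the lock so another processor's test-and-set can succeed. */
static void shared_lock_release(shared_lock *l)
{
    atomic_flag_clear(&l->busy);
}
```

A binary semaphore follows the same pattern; a counting semaphore or shared counter would replace the flag with an atomically updated integer.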
77. at encompass management of task execution, task communication and synchronization, system resources, predefined memory partitions, and dynamic-length memory. Nucleus RTX facilities are designed to operate in a consistent, reliable, and efficient manner. Each task executing under Nucleus has a priority. When multiple tasks are ready to execute, the task with the highest priority is executed first. Tasks of the same priority execute in a first-in, first-out (FIFO) manner. In addition to the many standard realtime facilities, Nucleus also provides facilities such as task priority modification, task time slicing, item sizes for communication queues defined by the user, suspension on full queues, suspension on multiple empty queues, both types of memory management, suspension on unavailable memory, and event flag consumption. Additionally, any Nucleus task suspension can be given a maximum amount of time to stay suspended. Software Products: Accelerated Technology offers other realtime software products for use with the TMS320C3x generation. These include a multitasking debugger, a reentrant C library, an MS-DOS-compatible file system, and, in the near future, networking support in the form of TCP/IP protocols. The Nucleus debugger provides access to all Nucleus structures in a user-readable fashion. Control structures for tasks, queues, semaphores, event flags, and memory management are all available for inspection. Additionally, the Nucleus debugg
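The scheduling rule described above (highest priority first, FIFO among tasks of equal priority) can be sketched in a few lines of C. This is an illustration of the rule only, not the Nucleus RTX implementation: the `task` record, the `ready_seq` counter, and the convention that a larger number means a higher priority are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical task record; not a Nucleus RTX data structure. */
typedef struct {
    int id;
    int priority;            /* assumption: larger value = higher priority */
    unsigned long ready_seq; /* order in which the task became ready */
    int ready;               /* nonzero if the task is ready to run */
} task;

/* Return the index of the task to run next, or -1 if none is ready.
 * Among ready tasks the highest priority wins; ties go to the task
 * that became ready first (FIFO). */
static int pick_next(const task *tasks, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].ready)
            continue;
        if (best < 0
            || tasks[i].priority > tasks[best].priority
            || (tasks[i].priority == tasks[best].priority
                && tasks[i].ready_seq < tasks[best].ready_seq)) {
            best = (int)i;
        }
    }
    return best;
}
```

A production kernel would normally keep one FIFO queue per priority level so that selection does not require a scan; the linear search here just keeps the rule visible.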
79. cations. Signals Monitored: Two 96-channel pyramid measurement modules per CPU support full TMS320C3x disassembly. Additional pyramid modules can be added to monitor other system signals. Input Impedance: The input impedance of all signals is 1 MΩ shunted by 8 pF, except STRB, RDY, MSTRB, IOSTRB, XRDY, and H1. Input impedance on these signals is approximately 500 kΩ shunted by 16 pF. Sampling: External clock, DC to 50 MHz; internal clock, 100 ms to 5 ns; setup time, 7.0 ns typical, reduced to 4 ns with timebase sync probe; hold time, 0 ns. Power: All MAP power is provided by the CLAS chassis. No power is required from the target system. Mechanical: Connection to the target is made using a 190-pin PGA package (15 x 15 grid) mounted on the MAP probe adapter. The probe adapter is placed between the CPU and its socket. A zero-insertion-force (ZIF) socket is included but can be removed when space is limited. Probing Considerations: The MAP probe adapter is made as small as possible to allow an easy connection when other chips are mounted next to the CPU. The probe adapter extends a maximum of 1.5 cm (0.6 in.) from the chip on the sides and 8.6 cm (3.4 in.) along the back. Miscellaneous: Size: Interface Box, 4.0 cm (1.6 in.) high, 21.3 cm (8.4 in.) wide, 22.9 cm (9.0 in.) deep; Probe Adapter, 2.1 cm (0.8 in.) high with ZIF, 6.5 cm (2.5 in.) wide, 13.7 cm (5.4 in.) long; Cable, 34 cm (13.5 in.) long. Weight: 0
80. 4.3.1 TMS320C30 and SPOX: Merging DSP and Control
4.3.2 From Proof of Concept to the Final Product
5 Development Support
5.1 TMS320C3x Optimizing ANSI C Compilers
5.1.1 TMS320C31 Compiler Optimizations
5.2 TMS320 Programmer's Interface: C/Assembly Source Debugger
5.3 TMS320C31 Assembly Language Tools
5.4 TMS320C3x Software Simulator
5.5 TMS320C3x Evaluation Module
5.6 TMS320C3x Emulators
5.7 TMS320C3x Application Board With Software Demo
5.8 HP 64776 Analysis Subsystem
5.9 TMS320 Technical Support
5.9.1 Technical Documentation
5.9.2 Details on Signal Processing Newsletter
5.9.3 TMS320 Bulletin Board Service
5.9.4 TMS320 DSP Technical Hotline
5.9.5 TMS320 Application Software
5.9.6 Design Workshops
5.9.7 Design Services
5.9.8
81. code was kept short and was written entirely in assembly language. Realtime I/O and instrument control were performed with an existing attached microprocessor board with an Intel 80186 running a commercial realtime operating system. Problems with this black-box approach indicated that what Doble needed was a more programmable DSP platform that could handle both signal processing and realtime instrument control. When Doble went from the experimental system to a production system, their engineers evaluated six DSPs. The TMS320C30 offered a general-purpose architecture that could perform both realtime control and signal processing functions. The floating-point arithmetic capability made data analysis easier because it guaranteed sufficient accuracy in the analysis algorithms over a wide dynamic range. Doble engineers also evaluated C compilers for the C30 and other DSPs; the C30 C compiler clearly generated better code. When they learned of the Spectron Microsystems SPOX operating system, they were ready to revise the architecture of the system: the realtime I/O and instrument control functions of the 80186 and the traditional signal processing functions of the fixed-point DSP would be performed by the TMS320C30. Using SPOX would also allow Doble to use an object-oriented approach to all of their software development and help them make their code maintainable and easy to modify. 4.3.2 From Proof of Conc
82. [Figure: debugger display, with callouts for scrolling data displays and interactive command entry] The debugger is easy to learn and use. Its window-, mouse-, and menu-oriented interface reduces learning time and eliminates the need to memorize complex commands. The debugger's customizable displays and flexible command entry let you develop a debugging environment that suits the system's needs (see Figure 5-8). A shortened learning curve and increased productivity reduce the software development cycle, speeding products to market. Conditional execution and single-stepping, including single-stepping into and over function calls, give you complete control over program execution. A breakpoint can be set or cleared with a click of the mouse or by typing commands. A memory map, which you can define, identifies the portions of target memory that the debugger can access. You can load only the symbol tables
83. d circular modify: ARn = circ(ARn + disp); ARn = circ(ARn - disp) (circular modify)
Mod field  Syntax          Operation                                    Description
01000      *+ARn(IR0)      addr = ARn + IR0                             With preindex (IR0) add
01001      *-ARn(IR0)      addr = ARn - IR0                             With preindex (IR0) subtract
01010      *++ARn(IR0)     addr = ARn + IR0; ARn = ARn + IR0            With preindex (IR0) add and modify
01011      *--ARn(IR0)     addr = ARn - IR0; ARn = ARn - IR0            With preindex (IR0) subtract and modify
01100      *ARn++(IR0)     addr = ARn; ARn = ARn + IR0                  With postindex (IR0) add and modify
01101      *ARn--(IR0)     addr = ARn; ARn = ARn - IR0                  With postindex (IR0) subtract and modify
01110      *ARn++(IR0)%    addr = ARn; ARn = circ(ARn + IR0)            With postindex (IR0) add and circular modify
01111      *ARn--(IR0)%    addr = ARn; ARn = circ(ARn - IR0)            With postindex (IR0) subtract and circular modify
LEGEND: addr = memory address; ARn = auxiliary register AR0-AR7; IRn = index register IR0 or IR1; disp = displacement; ++ = add and modify; -- = subtract and modify; circ() = address in circular addressing; % = where circular addressing is performed
Table 2-2. Indirect Addressing (Concluded)
Mod field  Syntax          Operation                                    Description
10000      *+ARn(IR1)      addr = ARn + IR1                             With preindex (IR1) add
10001      *-ARn(IR1)      addr = ARn - IR1                             With preindex (IR1) subtract
10010      *++ARn(IR1)     addr = ARn + IR1; ARn = ARn + IR1            With preindex (IR1) add and modify
10011      *--ARn(IR1)     addr = ARn - IR1; ARn = ARn - IR1            With preindex (IR1) subtract and modify
10100      *ARn++(IR1)     addr = ARn; ARn = ARn + IR1                  With postindex (IR1) add and modify
10101      *ARn--(IR1)     addr = ARn                                   With postindex (IR1)
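The circ() operation in the table above wraps an updated address back into a circular buffer. The following C model is a simplified sketch of that behavior for the postindex add and circular modify mode. It assumes that the buffer length is held in BK, that the buffer starts on an address boundary aligned to the smallest power of two greater than BK, and that the step is no larger than BK; these details should be checked against the TMS320C3x User's Guide. The function names `circ` and `post_add_circ` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of circ(): advance an address by step inside a circular
 * buffer of bk words. Assumes the buffer base is aligned to the
 * smallest power of two greater than bk, and |step| <= bk. */
static uint32_t circ(uint32_t arn, int32_t step, uint32_t bk)
{
    assert(bk > 0);
    assert(step >= -(int32_t)bk && step <= (int32_t)bk);

    uint32_t span = 1;
    while (span <= bk)              /* smallest power of two > bk */
        span <<= 1;

    uint32_t base = arn & ~(span - 1);            /* buffer start */
    int32_t index = (int32_t)(arn & (span - 1));  /* offset in buffer */

    index += step;
    if (index >= (int32_t)bk)
        index -= (int32_t)bk;       /* ran past the end: wrap forward */
    else if (index < 0)
        index += (int32_t)bk;       /* ran before the start: wrap back */

    return base + (uint32_t)index;
}

/* Model of the postindex add and circular modify mode:
 * addr = ARn, then ARn = circ(ARn + IR0). Returns addr. */
static uint32_t post_add_circ(uint32_t *arn, uint32_t ir0, uint32_t bk)
{
    uint32_t addr = *arn;
    *arn = circ(*arn, (int32_t)ir0, bk);
    return addr;
}
```

For example, with a 6-word buffer at 0x100, an access at 0x104 followed by a step of 3 leaves the register at 0x101 rather than 0x107.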
84. e. Tektronix PRISM 3000: realtime symbolic debugging; support of up to 1500 symbols from your compiler/assembly with LA LINK; realtime performance analysis; four disassembly display modes; automatic fetch prediction; 200-MHz timing analysis; time correlation of up to four DSPs; choice of lab or field-portable units; integrated digital scope module; hard disk for storage. Figure 6-7. Logic Analyzer Family. Tektronix 1240/1241 Logic Analyzer: Tektronix supports TMS320 development on their 1240/1241 Logic Analyzer. The 1240/1241 Logic Analyzer provides complete state and timing analysis support for hardware, software, and integration applications. It is ideal for the testing and debugging of algorithms on TMS320 hardware. Powerful triggering, dual timebase, and mnemonic disassembly make the 1240/1241 a valuable tool for developing processor-based products. 6.14 Wintriss. Wintriss Engineering Corporation, 4715 Viewridge 200, San Diego, CA 92123; (800) 733-8089. EVB Evaluation Board: The WECO EVB is a complete, low-cost PC/AT TMS320 evaluation board. Models are available for the C31. The EVB contains a wire-wrap area for system prototyping purposes and full access by standard PC I/O functions. Dual-ported memory p
85. e 5-5. Figure 5-4. Register Variables and Register Tracking/Targeting

    int gvar;
    reg(int i)
    {
        int j;
        gvar = call() & i;
        j = gvar + i;
    }

TMS320C31 compiler output is:

    _reg:
    * R4 is allocated to user var 'i'
    * R5 is allocated to user var 'j'
        CALL  _call          ; R0 = call()
        AND   R4,R0          ; R0 &= i
        STI   R0,@_gvar      ; gvar = R0
        ADDI  R4,R0,R5       ; tracks gvar in R0, targets result into R5 (j)

The compiler allocates local variables i and j into registers R4 and R5, as indicated by the comments in the assembly listing. Allocating i to R4 and tracking gvar in R0 allows the sum gvar + i to be computed with a 3-operand instruction, targeting the result directly into j in R5. Register Tracking/Targeting: The compiler tracks the contents of registers so that it avoids reloading values if they are used again soon. Variables, constants, and structure references such as (a.b) are tracked through both straight-line code and forward branches. The compiler also uses register targeting to compute expressions directly into specific registers when required, as in the case of assigning to register variables or returning values from functions. See Figure 5-4. Cost-Based Register Allocation: The compiler, when enabled, allocates registers to user variables and compiler temporary values according to their type, use, and frequency. Variables used within loops are weighted to have
86. ed branch functionality; repeat block and repeat single instructions. Compiler switches permit generation of 16-bit PC-relative conditional call instructions, control of interrupt latency time using the RPTS instruction, and specification of the number of wait states for the memory in which the program is executed. The Tartan Ada Librarian implements the Ada language requirements for separate compilation and dependency control. It supports multiple libraries and multiple accesses. It also permits usage of non-Ada object files within an Ada program. The Tartan linker is a fast, flexible linker for embedded Ada programs. It supports precise control over placement of code, data, and constants for individual packages, modules, sections, and subprograms in memory. It eliminates unused program sections from the executable program images, including as much of the highly modularized Tartan Ada runtimes as possible. An interface to the Texas Instruments TMS320C3x cross-assembler is also provided, including conversion of the output to Tartan's object file format. The Tartan AdaScope debugger provides complete window-oriented, source-level symbolic and assembly-level debugging for Ada programs using Ada-like commands. It operates remotely from the host system to the DSP processor using the TI XDS500 controller, or it can be run entirely on the host using the simulator. The Tartan
87. ed to any DSP system or processor board. DSP Link specifications are available for custom interfacing. Following are brief descriptions of Spectrum DSP Link peripherals. 4-Channel Analog I/O Board: four 12-bit input channels (58 kHz per channel) with quad synchronous sample-and-hold; two 12-bit output channels; third-order low-pass resistor-programmed filters on input and output; DSP Link data transfer interface. 32-Channel Analog Input Board: 32 12-bit input channels (7 kHz per channel) with 4-channel synchronous sample-and-hold; 32 first-order low-pass resistor-programmed input filters; 32 input buffer amplifiers; DSP Link data transfer interface. Pro Audio Board: AES/EBU interface; 48/44.1/32-kHz clock; word sync; DSP Link data transfer interface. Pro Audio Board: AES/EBU interface; SONY PCM interface; MIDI interface; 16x16 cascadeable RAM; 48/44.1/32-kHz clock; word sync; DSP Link data transfer interface. DSP Link Prototype Module: DSP Link slave wire-wrap interface for easy design of custom peripherals; buffered data; decoded address; R/W strobes. DSP Link Dual Processor Communications Module: allows two processors to communicate via DSP Link. 6.12 Tartan Inc. Tartan, Inc., 300 Oxford Dr., Monroeville, PA 15146; (412) 856-3600; FAX (412) 856-3636. Tartan Compilers: Tartan Inc. develops full-function Ada optimizing compilation sys
88. em
2 TMS320C31 Architectural Overview
2.1 TMS320C31 Block Diagram
2.2 Central Processing Unit (CPU)
2.2.1 CPU Register File
2.2.2 Auxiliary Register Arithmetic Units (ARAUs)
2.2.3 Multiplier
2.2.4 Arithmetic Logic Unit (ALU)
2.2.5 CPU Memory Addressing Modes
2.2.6 Instruction Set Summary
2.3 Memory Organization
2.3.1 RAM, ROM, and Cache
2.3.2 Memory Maps
2.4 Internal Bus Operation
2.5 On-Chip Peripherals
2.5.1 Timers
2.5.2 Serial Port
2.6 Direct Memory Access (DMA)
2.7 External Bus Operation
2.7.1 External Bus Control Features
2.7.2 Multiprocessor Support
2.8 Interrupts
2.9 TMS320C31 Signal Descriptions
3 TMS320C31 Features/Performance Comparison
3.1 TMS320C31 Feature Com
89. en the TMS320C3x performs two or more back-to-back read cycles on the same memory page (one page of memory holds 256 words, the default memory bank size for the TMS320C3x). SPOX Operating System software is also available for the application board. 5.8 HP 64776 Analysis Subsystem. TI and Hewlett-Packard jointly designed and developed the HP 64776 Analysis Subsystem, an emulator/analyzer for the TMS320C3x (see Figure 5-12). For TMS320C31 analysis, an adapter is available from HP to use the subsystem with a surface-mounted TMS320C31. The HP 64776 combines with the TI TMS320C3x XDS emulator to yield a complete tool set for integrating hardware with software, producing an extremely powerful debug environment. HP's active probe technology yields maximum electrical and mechanical transparency, improved signal quality, and realtime control and debug of the target system at full operating speed. The complete analysis subsystem integrates the HP 64776, the TMS320C3x XDS, and the C source debugger (described in Section 5.2) in a stand-alone PC environment. The TI debugger acts as the user interface, and communications between the subsystem and the PC are handled through an RS-232C connector. This powerful system provides software and hardware breakpoint and trace as well as sophisticated bus cycle analysis. Figure 5-12. HP 64776 Analysis Subsystem
90. ependent fashion. Device drivers are the key to customizing SPOX for a particular system environment and to ensuring portability of SPOX applications from one system to the next. Unlike virtually every other operating system or realtime executive, the device-independent I/O interface supported by SPOX does not include a read or write function in the traditional sense. Rather than mandating one pair of general-purpose functions for all input and output, SPOX allows for a broader set of I/O operations optimized for two fundamentally different forms of program interaction with underlying devices found in realtime DSP systems: asynchronous data streaming, in which the program and device are in a producer/consumer relationship, and synchronous message passing, in which the program and device are in a client/server relationship. C Runtime Environment: The SPOX application libraries include many of the standard functions that are typically not implemented by C compilers targeted for DSP processors. Included among these are the routines comprising the C stdio library, together with other standard functions requiring operating system support: opening/closing named files (fopen, fclose); reading/writing byte streams (getc, putc); formatted I/O (printf, scanf); utility functions (system, time); program termination (exit, abort); memory management (malloc, free). By furnishing these functions, the SPOX
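The byte-stream routines listed above are ordinary standard C, so code written against them is portable to any environment that furnishes the stdio library. The short sketch below uses only standard C calls (`getc`, `putc`) and nothing specific to SPOX; the helper name `copy_stream` is hypothetical.

```c
#include <stdio.h>

/* Copy every byte from in to out using getc/putc.
 * Returns the number of bytes copied, or -1 if a write fails. */
static long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;      /* write error on the output stream */
        count++;
    }
    return count;
}
```

A caller would typically obtain the two streams with `fopen`, check both for NULL, call `copy_stream`, and release them with `fclose`.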
91. ept to the Final Product. To validate this new architecture, Doble purchased the Sonitech Spirit-30 development board for the PC. Because SPOX had already been ported to the Sonitech board, Doble completed a prototype of the new system in two months. Doble then ported SPOX to their customer's C30 platform using the SPOX OS component product. This effort involved reconfiguring SPOX and writing a few device drivers for data I/O and host I/O. Because the two hardware platforms had the same SPOX system software, almost all of the prototype code was reused in the product. While all of Doble's DSP code had been written in assembly language, the TMS320C30 was programmed in C using the SPOX realtime kernel and math library. Because the SPOX math library had been coded by Spectron in assembly language, the signal processing algorithms ran efficiently, using only about 50% of the C30 cycles. This left enough cycles to perform realtime control functions and new signal processing algorithms. Because the resultant DSP software architecture was more modular, new functions could be added or changed easily. The multitasking capability of SPOX allowed math functions to run concurrently as the DSP acquired data in realtime and communicated with the PC host. Because of the flexibility of the DSP platform, Doble planned to provide different services and products to their customers using the same platform. Chapter 5 De
92. er allows you to dynamically execute most of the Nucleus RTX service calls. The reentrant C libraries supplied by Accelerated Technology provide standard ANSI interfaces for all functions with the exception of file services; file services are provided by the Nucleus file system. Because the library routines are fully reentrant, application tasks running under Nucleus can use them. Nucleus File is an MS-DOS-compatible file system that is capable of reading and writing standard floppy and hard disk formats. Nucleus File is specifically designed for embedded applications. Accelerated Technology's realtime software products are primarily written in ANSI C and are optimized for performance on the TMS320C3x DSPs. All software products are delivered with complete source code and without any royalties. Features of the Nucleus RTX Realtime Multitasking Executive: realtime multitasking executive for the TMS320C3x DSPs; complete source code; no royalties; priority based, with optional preemption and time slicing; task communication with user-defined public queues; item size of each queue defined by user; optional task suspension on full queues; optional task suspension on multiple queues; task synchronization with event flags; optional consumption of event flags; resource management with semaphores; predictable fixed-length memory management; flexible variable-length memory management; optional task suspen
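The "predictable fixed-length memory management" feature listed above is commonly realized as a pool of equal-sized blocks threaded on a free list, so that allocation and release each take constant time. The sketch below illustrates that general technique; it is not the Nucleus RTX interface, and the names `pool`, `pool_init`, `pool_alloc`, and `pool_free`, as well as the block size and count, are hypothetical.

```c
#include <stddef.h>

#define BLOCK_SIZE  32  /* hypothetical block size in bytes */
#define BLOCK_COUNT 4   /* hypothetical number of blocks in the pool */

/* A block is either on the free list (next is valid) or in use
 * (payload belongs to the caller). */
typedef union block {
    union block *next;
    unsigned char payload[BLOCK_SIZE];
} block;

typedef struct {
    block storage[BLOCK_COUNT];
    block *free_list;
} pool;

/* Thread every block onto the free list. */
static void pool_init(pool *p)
{
    p->free_list = NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Constant time: pop the head of the free list.
 * Returns NULL when the pool is exhausted. */
static void *pool_alloc(pool *p)
{
    block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* Constant time: push a block back onto the free list.
 * mem must be a pointer previously returned by pool_alloc. */
static void pool_free(pool *p, void *mem)
{
    block *b = mem;
    b->next = p->free_list;
    p->free_list = b;
}
```

Because every block has the same size, the pool cannot fragment, which is what makes its timing predictable. In a multitasking kernel the list operations would also need protection from preemption.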
93. er systems. Figure 1-4 shows the benefits of replacing a controller/coprocessor with a TMS320C31. Figure 1-4. Benefits of Replacing a Controller/Coprocessor With a TMS320C31-Based Embedded System. [Figure: a RISC/CISC controller with OS for system control, plus a numeric coprocessor for realtime algorithm execution, is replaced by a single TMS320C31 with OS for system control and numeric processing; each design connects to system peripherals.] Replacement benefits: simplified design; reduced data flow; greater design flexibility; small form factor; lower memory cost; lower device count and cost; fewer communication bottlenecks; single development environment. Chapter 2: TMS320C31 Architectural Overview. This chapter provides an architectural overview of the TMS320C31 embedded processor. An in-depth description of its features can be found in the TMS320C3x User's Guide. Topics discussed in this chapter include: 2.1 TMS320C31 Block Diagram; 2.2 Central Processing Unit (CPU); 2.3 Memory Organization; 2.4 Internal Bus Operation; 2.5 On-Chip Peripherals; 2.6 Direct Memory Access (DMA); 2.7 External Bus Operation; 2.8 Interrup
94. esults. The eight auxiliary registers support a variety of indirect addressing modes and can be used as general-purpose 32-bit integer and logical registers. The remaining registers provide system functions such as addressing, stack management, processor status, interrupts, and block repeat. The register names and assigned functions are listed in Table 2-1. Following the table, the function of each register or group of registers is briefly described.
Table 2-1. CPU Registers
Register Name   Assigned Function
R0              Extended-precision register 0
R1              Extended-precision register 1
R2              Extended-precision register 2
R3              Extended-precision register 3
R4              Extended-precision register 4
R5              Extended-precision register 5
R6              Extended-precision register 6
R7              Extended-precision register 7
AR0             Auxiliary register 0
AR1             Auxiliary register 1
AR2             Auxiliary register 2
AR3             Auxiliary register 3
AR4             Auxiliary register 4
AR5             Auxiliary register 5
AR6             Auxiliary register 6
AR7             Auxiliary register 7
DP              Data page pointer
IR0             Index register 0
IR1             Index register 1
BK              Block size
SP              System stack pointer
ST              Status register
IE              CPU/DMA interrupt enable
IF              CPU interrupt flags
IOF             I/O flags
RS              Repeat start address
RE              Repeat end address
RC              Repeat counter
The extended-precision registers (R7-R0) are capable of storing and supporting operations on 32-bit integer and 40-bit floating-point numbers. Any instruction that assumes the operands are floating-point numbers uses bits 39
95. eted, it took just one day to move the SPOX realtime kernel and the recognizer software over to the VPRO 4 hardware. Each C31 on the VPRO 4 runs several tasks using the preemptive multitasking capability of SPOX. A high-priority task moves time-critical voice data to and from the voice bus. The bulk of the C31 cycles, however, are used for speech recognition: it runs one recognition task for continuous word input or two recognition tasks for discrete word input. There are also background tasks for communicating with the host and other housekeeping functions. 4.1.5 A New Level of Interoperability. VPC's C31-based platform gives them a higher-performance system, and it lets them serve their customers better. Research continues at VPC to improve the recognition algorithms and take advantage of the processing power of the VPRO 4. In some customer applications, speech recognition has to be complemented with other voice functions such as speech synthesis. The VPRO 4 makes it easy to port third-party voice algorithms to the DSP platform, significantly reducing total system costs by removing the need for multiple hardware platforms. Other VPC customers have their own C31/SPOX hardware. The commonality in the system environment makes it much easier for VPC to port their recognition software to the customer's hardware. This level of interoperability is a significant milestone for speech recognition and signal processing technology. Over
96. evelopment environment, external memory cost, and integrated peripherals of the TMS320C31 are equivalent to those of 32-bit microcontroller solutions. At the same time, the powerful instruction set and pipelined CPU provide the system control performance of a RISC processor at a more affordable price. But the TMS320C31 is superior to RISC/CISC solutions in numerical performance and emulation capability. This best-of-both-worlds feature set delivers many benefits to next-generation embedded systems. With a TMS320C31, many added-cost system features become reduced-cost features. Traditional embedded system architectures use a microcontroller for system control and a coprocessor (companion math chip, programmable or special-purpose DSP, or ASIC) for math support. This traditional system architecture has performance and time-to-market drawbacks because the designer must learn two different architectures and development environments and attempt to implement efficient communications between different types of processors. Today, designers are using a C31 to replace microcontrollers for higher performance and to reduce system cost and time to market in dual-processor designs. Also, for even higher performance and homogeneous systems, multiple C31s can be used. The C31 offers numerous advantages for embedded control applications such as voice mail, industrial automation, instrumentation, audio, motor control, automotive, and laser print
97. f applications. TI and its third parties offer a wide variety of software that can be licensed. This software covers a range of DSP functionality that includes vocoders, speech recognition, modems, audio coders, and image coders. The software available for license can provide a head start in the development of your final application. In addition, software applications that have been published in TI DSP user's guides and application books are available via the BBS. Contact the DSP Hotline for a list of software available for the TMS320C31. 5.9.6 Design Workshops. Texas Instruments offers a wide array of up-to-date technical product seminars and design workshops through its Technical Training Organization (TTO) to assist designers in developing the skills needed to implement their ideas quickly, produce a quality product, and shorten time to market. Applications assistance is also offered through local Regional Technology Centers (RTCs). The DSP design workshops give design engineers hands-on experience using the latest TMS320 products, development tools, and design techniques. These workshops go beyond the standard lecture format. The exercises and lab experiments start with the basics and move quickly into hands-on exercises. In these workshops, the student learns by doing, not just listening or observing. The workshops are designed to help customers shorten the design cycle
98. f the C31's on-chip instruction cache. The C31 allows the user to implement algorithms using either floating-point or integer math while achieving the same performance with either data format. C-callable optimized DSP algorithms are available for the C31. Code development is not required to build a software monitor for the C31. A target monitor plugs directly into the target system's C31. The C31 has a clear family road map for higher performance with the availability of the C3x and C4x generations of TMS320s.

4.3 Test Equipment Example Using SPOX

Developed by Doble Engineering in the 1930s, the Doble test is run routinely by power utility companies to test insulation material used in power substations. Over time, the electrical insulation material can break down, which can lead to severe damage to the substation and interruptions to service if the problems go undetected.

The insulation test procedure involves applying an alternating voltage across the material specimen and a reference sample. The electrical current, capacitance, dielectric loss, and power factor across the test specimens are measured and analyzed in realtime. To make the test procedure practical, Doble has designed their equipment to be quick and easy to operate and able to make accurate measurements in the presence of a high level of electrical interference.

Figure 4-3. Doble Test Set-Up
99. ftware and hardware applications are included.

TMS320 Floating-Point DSP Optimizing C Compiler User's Guide (literature number SPRU034) describes the TMS320 floating-point C compiler. This C compiler accepts ANSI-standard C source code and produces TMS320 assembly language source code for the C3x and C4x generations of devices.

TMS320 Floating-Point DSP Assembly Language Tools User's Guide (literature number SPRU035) describes the assembly language tools (assembler, linker, and other tools used to develop assembly language code), assembler directives, macros, common object file format, and symbolic debugging directives for the C3x and C4x generations of devices.

TMS320C3x C Source Debugger User's Guide (literature number SPRU053) tells you how to invoke the C3x emulator, evaluation module, and simulator versions of the C source debugger interface. This book discusses various aspects of the debugger interface, including window management, command entry, code execution, data management, and breakpoints, and includes a tutorial that introduces basic debugger functionality.

TMS320C30 Hewlett-Packard 64776 Analysis Subsystem User's Guide (literature number SPRU071) describes the analysis subsystem, which supplements the C30 emulator capabilities by providing realtime breakpoint, trace, and timing features. The analysis subsystem can be used only with the C30 emulator.

SPARC and S-bus are trademarks of Sun Microsystems, Inc. Spirit 30 is a t
100. functions to be done in parallel every cycle without latency. Hence, 40 MFLOPS (or the equivalent rate of integer multiply-accumulate operations) can be sustained with a 40-MHz TMS320C31. In addition to the integrated math support, the CPU architecture provides a high degree of parallelism on chip, allowing on- and off-chip resources to be utilized most effectively. Figure 2-2 is a block diagram of the C31 CPU.

Figure 2-2. Central Processing Unit (CPU)
[Block diagram. Register groups shown: extended-precision registers (R0-R7), auxiliary registers (AR0-AR7), other registers (12). Disp: an 8-bit integer displacement carried in a program control instruction.]

2.2.1 CPU Register File

The TMS320C31 provides 28 registers in a multiport register file (Table 2-1) that is tightly coupled to the CPU. All of these registers can be operated upon by the multiplier and ALU and can be used as general-purpose registers. However, the registers also have some special functions. For example, the eight extended-precision registers are especially suited for maintaining extended-precision floating point r
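As a rough check on the throughput figure, the arithmetic can be written out. This is only a sketch: it assumes the 50-ns instruction cycle quoted elsewhere in this brief applies to the 40-MHz device, and it counts one multiply plus one ALU operation per cycle. The symbols t_cycle and N_ops are introduced here purely for illustration.

```latex
\[
  \frac{1}{t_{\text{cycle}}} = \frac{1}{50\ \text{ns}} = 20 \times 10^{6}\ \text{cycles/s},
  \qquad
  N_{\text{ops}} = 2\ \text{operations/cycle}
\]
\[
  \text{throughput} = 20 \times 10^{6}\ \text{cycles/s} \times 2\ \text{operations/cycle}
                    = 40 \times 10^{6}\ \text{operations/s} = 40\ \text{MFLOPS}
\]
```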
101. g a block repeat. When the processor is operating in the repeat mode, the 32-bit repeat start address register (RS) contains the starting address of the block of program memory to be repeated, and the 32-bit repeat end address register (RE) contains the ending address of the block to be repeated.

The program counter (PC) is a 32-bit register containing the address of the next instruction to be fetched. Although the PC is not part of the CPU register file, it is a register that can be modified by instructions that modify the program flow.

2.2.2 Auxiliary Register Arithmetic Units (ARAUs)

Two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two addresses in a single cycle. The ARAUs operate in parallel with the multiplier and ALU. They support addressing with displacements, index registers (IR0 and IR1), and circular and bit-reversed addressing.

2.2.3 Multiplier

The multiplier performs single-cycle multiplications on 24-bit integer and 32-bit floating-point values. The TMS320C31 implementation of floating-point arithmetic allows for floating-point operations at fixed-point speeds via a 50-ns instruction cycle and a high degree of parallelism. To gain even higher throughput, you can use parallel instructions to perform a multiply and an ALU operation in a single cycle.

When the multiplier performs floating-point multiplication, the inputs are 32-bit floating-point numbers, and the result is a 40-bit floating-point number. When
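The circular and bit-reversed addressing modes mentioned above are carried out in hardware by the ARAUs. The portable C sketch below only models the index sequences such modes produce; it is not TI code, and the function names circ_next and bit_reverse are hypothetical names chosen for this illustration.

```c
/* Host-side model of two addressing patterns; illustrative only. */

/* Next index in a circular buffer of len entries (len must be > 0).
   The index wraps back to the start instead of running off the end. */
unsigned circ_next(unsigned idx, unsigned step, unsigned len)
{
    return (idx + step) % len;
}

/* Reverse the low `bits` bits of idx. A radix-2 FFT of 2^bits points
   reads or writes its data in this order. */
unsigned bit_reverse(unsigned idx, unsigned bits)
{
    unsigned r = 0;
    while (bits-- > 0) {
        r = (r << 1) | (idx & 1u);  /* shift in the lowest remaining bit */
        idx >>= 1;
    }
    return r;
}
```

For an 8-point transform (3 address bits), bit_reverse maps indices 0 through 7 to 0, 4, 2, 6, 1, 5, 3, 7, and circ_next(6, 3, 8) wraps around to 1.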
102. gives you the complete picture of the capabilities, performance, scalability, and ease of use of these realtime kernels. RTXC and RTXC/MP are available for a one-time site license fee. All configurations of processor and compiler bindings include full source code and require no runtime royalties. Most compilers are supported.

The combination of RTXC and RTXC/MP addresses a broad range of applications. RTXC is aimed at embedded applications that would typically use a single TMS320C2x, C3x, or C5x DSP. RTXC/MP is targeted at applications employing multiple TMS320C3x or C4x processors.

RTXC and RTXC/MP share many of the same attributes and components. Most importantly, both kernels use a similar application program interface (API). However, RTXC/MP extends the RTXC API to include those functions that are necessary for the special requirements of the multiprocessing environment. The API provides a wide range of kernel services, such as task management, timer management (including timeouts), intertask communication and synchronization, memory and resource management, and processor-specific services. Intertask communication can occur via semaphores, messages, and FIFO queues. Because of the commonality of the API, software developed for the RTXC single-processor system is highly portable to the multiprocessing world of RTXC/MP.

A set of high-end utilities helps you configure, compile, and fine-tune the application. Both kernels use a syste
103. he actual peripheral environment of the target system, including wait states and access privileges.

TMS320C3x XDS System Requirements
Host: IBM PC/AT
Slot: One and one-half 16-bit slots
Memory: Minimum of 640K words
Storage: One floppy drive and one hard drive
Operating System: PC-DOS/MS-DOS 2.0 or later version
Power Supply: Minimum approximately 3 amps, 5 volts, 150 watts

5.7 TMS320C3x Application Board With Software Demo

Key features of the TMS320C3x application board are:
- 16K x 32-bit zero-wait-state full-speed SRAM on the primary bus
- Two selectable banks of 8K x 32-bit zero-wait-state full-speed SRAM on the expansion bus
- TMS320C30 DSP
- 512K x 32-bit DRAM, user-upgradable to 1M x 32 bits

The large amount of on-board SRAM affords realtime emulation and memory storage flexibility for a variety of algorithms. The on-board SRAM provides zero-wait-state access to memory, allowing reads and writes in realtime.

Three types of DRAM cycles are used on the TMS320C3x application board: single-word read, single-word write, and page-mode read. These operations require four, two, and one wait state per access, respectively. Note that when you invoke page-mode read while accessing the emulator's DRAM, fewer wait states are required. Page-mode DRAM is often used to improve bulk storage performance. Page-mode read cycles are automatically invoked wh
104. he cost of the speech recognition hardware. They could go to faster hardware that would execute multiple recognizers per chip, or they could pack more recognizers onto a single ISA board so they could amortize the board and system cost over more recognizers.

They also wanted this new platform to give them more power and flexibility to handle new algorithms. Some of their customers wanted to port different voice functions, such as speech synthesis, to the VPC hardware platform. To ensure that the hardware platform could be easily reprogrammed, VPC wanted to replace their heterogeneous architecture (viz., 386 and C25) with a homogeneous multiprocessing architecture, which makes it much easier to partition functions across processors. Since the new processor had to take on the functions of both the 386 and C25, the support of a multitasking operating system was important.

The VPC criteria for selecting the processor for their next-generation platform were as follows:
1. The cost of hardware per recognizer
2. The number of microprocessors (viz., recognizers) they can incorporate on a board
3. C compiler and operating system support for pre-emptive multitasking and multiprocessing

4.1.3 VPRO 4: A Homogeneous Multi-DSP Architecture

The new platform VPC developed, called the VPRO 4, is an ISA board with four C31s and a shared-memory architecture. Each C31 has 512K bytes of zero-wait-state local memory
105. her RAM or ROM. This single address space allows you to maximize the use of the memory space and to partition it as desired.

2.3.1 RAM, ROM, and Cache

Figure 2-3 shows how the memory is organized on the TMS320C31. RAM blocks 0 and 1 are 1K x 32 bits each. Each RAM and ROM block is capable of supporting two CPU accesses in either RAM block. The C31 also has an on-chip bootloader ROM, which allows programs stored in off-chip memory or transferred through the serial port to be loaded anywhere in the memory map.

The separate program buses, data buses, and DMA buses allow parallel program fetches, data reads and writes, and DMA operations. For example, the CPU can access a data value in one RAM block and perform an external program fetch in parallel with the DMA loading another RAM block, all within a single cycle.

A 64 x 32-bit instruction cache is provided to store frequently used sections of code, thus greatly reducing the number of off-chip accesses necessary. This allows code to be stored off chip in slower, lower-cost memories. The external buses are also freed for use by the DMA, external memory fetches, or other devices in the system.

Figure 2-3. Memory Organization
106. high
IDLE | Idle until interrupt | PC + 1 → PC; idle until next interrupt
RETIcond | Return from interrupt conditionally | If cond = true or missing: *SP-- → PC, 1 → ST(GIE); else continue
RETScond | Return from subroutine conditionally | If cond = true or missing: *SP-- → PC; else continue
SIGI | Signal, interlocked | Signal interlocked operation; wait for interlock acknowledge; clear interlock
SWI | Software interrupt | Perform emulator interrupt sequence
TRAPcond | Trap conditionally | If cond = true or missing: next PC → *++SP, trap vector N → PC, 0 → ST(GIE); else continue

LEGEND:
src: general addressing modes
src1: three-operand addressing modes
src2: three-operand addressing modes
Csrc: conditional-branch addressing modes
Sreg: register address (any register)
Dreg: register address (any register)
Rn: register address (R7-R0)
ARn: auxiliary register n (AR7-AR0)
Daddr: destination memory address
addr: 24-bit immediate address (label)
count: shift value (general addressing modes)
cond: condition code (see Chapter 11)
SP: stack pointer
ST: status register
GIE: global interrupt enable register
RE: repeat end address register
RM: repeat mode bit
RS: repeat start register
TOS: top of stack
PC: program counter
C: carry bit

Table 2-4. Program Flow Control Instruction Summary
Mnemonic | Description | Operation
Branch conditionally (standard) | If cond = true
107. ils on Signal Processing Newsletter

The TMS320 newsletter, Details on Signal Processing, is published quarterly to update TMS320 customers on product information and industry trends. It covers TMS320 products, documentation, third-party support, application boards, mini application reports, development tool updates, contacts for support, design workshops, seminars, conferences, and the TMS320 university program. To be added to the mailing list, call the Customer Response Center at (214) 995-6611.

5.9.3 TMS320 Bulletin Board Service

The TMS320 Bulletin Board Service (BBS) is a telephone-line computer bulletin board that provides access to information about the TMS320 family. The BBS is an excellent means of communicating specification updates for current or new TMS320 application reports as they become available. It also serves as a means to trade programs with other TMS320 users.

The BBS contains TMS320 source code from the more than 2000 pages of application reports written to date. These programs include macro definitions, FFT algorithms, filter programs, ADPCM algorithms, echo cancellation, graphics control, companding routines, and sine-wave generators.

You can access the BBS with a terminal or PC and a modem. The modem must be able to communicate at a data rate of either 300, 1200, 2400, or 9600 bps. A character length of eight bits is required, with one stop bit and no parity. The telephone number of the bulletin board is 713 274 232
108. ing function arguments
- Parallel instructions
- Conditional instructions
- Loop unrolling

5.1.1.1 General-Purpose Optimizations

Algebraic Reordering, Symbolic Simplification, Constant Folding

For optimal evaluation, the compiler simplifies expressions into equivalent forms requiring fewer instructions or registers. For example, the expression (a + b) - (c + d) requires more instructions and registers to evaluate than the equivalent expression a + b - c - d. Operations between constants are folded into single constants. For example, a = (b + 4) - (c + 1) becomes a = b - c + 3. See Figure 5-1.

Alias Disambiguation

Programs written in C generally use many pointer variables. Frequently, compilers are unable to determine whether or not two or more l-values (lowercase L: symbols, pointer references, or structure references) refer to the same memory location. This aliasing of memory locations often prevents the compiler from retaining values in registers because it cannot be sure that the register and memory continue to hold the same values over time. Alias disambiguation is a technique that determines when two pointer expressions cannot point to the same location, allowing the compiler to freely optimize such expressions.

Data Flow Optimizations

Collectively, the following three data flow optimizations replace expressions with less costly ones, detect and remove unnecessary assignments, and avoid
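The constant-folding and aliasing points above can be made concrete with a small C sketch. It is an illustration under stated assumptions, not compiler output: the function names unfolded, folded, and sum_twice are hypothetical, and integer operands are used so that the reordered expression gives exactly the same result.

```c
/* Illustrative only; these functions do not come from the brief. */

/* As written in the source: two constants and a difference of sums. */
int unfolded(int b, int c)
{
    return (b + 4) - (c + 1);
}

/* The equivalent form after the constants 4 and 1 are folded into 3. */
int folded(int b, int c)
{
    return b - c + 3;
}

/* Aliasing: p and q may point to the same int, so the value read
   through p cannot simply be kept in a register across the store. */
int sum_twice(int *p, int *q)
{
    int first = *p;
    *q = 0;             /* changes *p as well when p == q */
    return first + *p;  /* second read must see the stored value */
}
```

If the compiler can establish that p and q never refer to the same location, it is free to reuse the first read instead of loading *p again, which is the benefit alias disambiguation aims for.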
109. ing system
- Small 3/4-length PC/AT board format
- Boot EPROM for standalone operation
- Zero-wait-state SRAM, up to 640K words
- Dual-port SRAM host interface
- High-quality on-board analog interfaces
- Uprated DSPLINK parallel bus expansion
- Comprehensive software support

The board format has been designed to the familiar PC/AT specification to ease initial evaluation and development work. Existing users of LSI TMS320C30 products can quickly transfer code to the PC/C31 to implement a target system. The 3/4-length board format aids in keeping occupied space to a minimum. True standalone operation is achieved by the use of the boot EPROM. Using the built-in boot loader of the C31, the board can be configured to self-initialize and begin execution of applications. The wide range of zero-wait-state SRAM options, from 32K to 640K words, allows any size of system to be specifically configured for the required application. From an intelligent microcontroller in industrial use to a multitasking signal processing design, all can be accommodated in a high-speed solution. The 2K-word dual-port memory host interface allows rapid communication, allowing a host PC to transfer data to and from the PC/C31 without halting the DSP. This facility is a great asset in systems that use both the DSP and the host machine in a dual-processing arrangement, where efficient communication be
111. [Figure: C source in which initstr(struct s *ps, char t[12]) calls blkcpy((char *)ps, t, 12) on a structure containing int a, b, c[10], followed by the TMS320C31 compiler output for initstr. The output assigns the variables to registers (R2, AR2, AR4, BK, RC) and shows the expansion of blkcpy, which copies 12 words using RPTS.]

The special in-line declaration of blkcpy results in the call being replaced with the function's body. The compiler creates temporary variables blkcpy_1_to and blkcpy_1_from, corresponding to the parameters of blkcpy. Often, copy propagation can eliminate assignments to such variables when the argument expressions are not reused after the call.

5.1.1.2 Optimizations Specific to the TMS320C31 Compiler

Register Variables

The compiler helps maximize the use of registers for storing local variables, parameters, and temporary values. Variables stored in registers can be accessed more efficiently than variables in memory. This optimization is particularly effective for pointers that arise when array index constructs are turned into loop induction variables. See Figure 5-4 and Figur
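The inline expansion of blkcpy described above can be sketched in portable C. This is an assumption-laden illustration rather than the brief's own listing: the body of blkcpy and the wrapper names init_by_call and init_expanded are hypothetical, and only the temporaries blkcpy_1_to and blkcpy_1_from follow the names the text mentions.

```c
/* Illustrative sketch; the function bodies are assumed, not TI's. */

/* A small copy routine, the kind of function worth expanding in line. */
static void blkcpy(char *to, const char *from, int n)
{
    while (n-- > 0)
        *to++ = *from++;
}

/* What the programmer writes: an ordinary call. */
void init_by_call(char dst[12], const char src[12])
{
    blkcpy(dst, src, 12);
}

/* Roughly what inline expansion produces: the call is replaced by
   the body, with temporaries standing in for the parameters. */
void init_expanded(char dst[12], const char src[12])
{
    char *blkcpy_1_to = dst;
    const char *blkcpy_1_from = src;
    int n = 12;

    while (n-- > 0)
        *blkcpy_1_to++ = *blkcpy_1_from++;
}
```

Both wrappers copy the same 12 elements. The expanded form has no call overhead, and because dst and src are not used again after the copy, a later pass could drop the two temporaries and operate on the arguments directly.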
112. istics:
- High overall performance for peripheral device management and data flow control
- Software compatibility, efficient compilers, and realtime operating system support
- Mature development tools and third-party support
- Flexibility
- Reliability
- Availability
- Low device price
- Low system cost

The TMS320C31's ability to satisfy these needs makes it an excellent choice when compared to embedded RISC and high-end CISC embedded controllers. The TMS320C31:
- Provides a low-cost solution
- Supports a general-purpose programming model
- Supports efficient C language compilation
- Enables high-performance system control
- Supports coprocessor math performance on chip
- Integrates system peripherals on chip
- Allows fast context switching

The TMS320C31 is an embedded controller with dedicated digital signal processing support that provides low cost, high performance, system integration, and ease of use. Due to these cost and performance advantages, the TMS320C31 is displacing RISC and high-end CISC processors in a wide range of applications across many industries.

1.2 TMS320C31 Key Features

The TMS320C31 includes the features normally associated with a general-purpose embedded controller, so designing with it is very similar to designing with RISC or CISC devices. But the C31 is distinguished by many high-performance features not found on processors in its price range. High
113. ity the C31 especially excels in operations that require multiplies in the inner loop. In addition to the on-chip hardware math support, the C31 performs the waveform calculations quickly due to its 2K words of on-chip general-purpose memory and on-chip program cache.

4.2.4 Fast Fourier Transform

The requirements for the floating-point FFT are similar to those for waveform processing. The processor must perform a 1K FFT fast enough to allow 10 screen updates per second. The C31 FFT performance far exceeded the user update requirement. And if greater FFT performance was needed, Nicolet observed that they could use the C-callable, hand-optimized assembly language FFT routines available from Texas Instruments. This is not an option with many RISC processors.

4.2.5 Advantages of a TMS320C31 System

Nicolet explained their choice of a C31 as the embedded processor with the following comments:
1. The C31 offers a good balance of data movement and numeric performance for the price.
2. The C31's performance is on par with more expensive processors, making many of the extra-cost product options either no-cost options or extra-margin options.
3. The C31 is very efficient at accessing arrays of data due to its ability to do auto-increment indirect addressing.
4. The majority of their code consists of small loops, which makes good use o
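The comments above credit the C31's multiply performance in inner loops and its auto-increment indirect addressing. A small loop of the kind described might look like the following C sketch; the function name dot and its signature are hypothetical, and how a particular compiler maps the pointer increments onto the hardware is an assumption, not something the brief states.

```c
/* Illustrative inner loop: a dot product over two float arrays. */
float dot(const float *x, const float *h, int n)
{
    float acc = 0.0f;

    /* Each pass reads one element from each array, advances both
       pointers, and performs one multiply and one add. */
    while (n-- > 0)
        acc += *x++ * *h++;

    return acc;
}
```

The post-incremented pointer reads are the C idiom that corresponds most directly to auto-increment indirect addressing, and the short loop body is the kind of code that fits easily in a small instruction cache.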
115. kpoint is reached, the program halts execution. At this point, the status of the registers and of the CPU is available. Their contents are visible in the appropriate windows; to view the contents of other memory locations, only one command is required.

Software trace lets you view the state of the TMS320C3x when a breakpoint is reached. This information can be saved in a file for future analysis. Software timing allows you to track the clock cycles between breakpoints for benchmarking of time-critical code.

Single-step execution gives you the capability to step through the program one instruction at a time. After each instruction, the status of the registers and CPU is displayed. This provides greater flexibility during software debug and helps reduce the development time.

Object code can be downloaded to any valid TMS320C3x memory location (program or data) via the scan-path interface. Downloading a 1K-byte object program typically takes 100 ms. In addition, by inspecting and modifying the registers while single-stepping through a program, you can examine and modify program code or parameters.

The emulator's configurability gives your system flexibility. You can configure both memory and screen color. The address range, memory type, and access type assigned to each location can also be configured. The memory map, which may include EPROM, SRAM, DRAM, and on-chip memory and peripherals, can be configured to reflect t
116. lington Heights, IL 60005; (708) 640-2909
DALLAS: Texas Instruments, 7839 Churchill Way, Park Central V, MS 3984, Dallas, TX 75251; (214) 917-3881
INDIANAPOLIS: Texas Instruments, 550 Congressional Blvd., Suite 100, Carmel, IN 46032; (317) 573-6400
NORTHERN CALIFORNIA: Texas Instruments, 5353 Betsy Ross Drive, Santa Clara, CA 95054; (708) 748-2220
SOUTHERN CALIFORNIA: Texas Instruments, 1920 Main St., Suite 900, Irvine, CA 92714; (714) 660-8140
OTTAWA: Texas Instruments Canada Ltd., 301 Moodie Drive, Suite 102, Nepean, Ontario, Canada K2H 9C4; (613) 726-1970
MEXICO CITY: Texas Instruments de Mexico, Alfonso Reyes 115, Col. Hipodromo Condesa, Mexico D.F., Mexico 06170; 52 5 515 6081, 52 5 515 6249

International Locations
AUSTRALIA: Texas Instruments Australia Ltd., 6-10 Talavera Road, North Ryde, New South Wales, Australia 2113; Tel: 61 2 8789000
JAPAN (Tokyo): Texas Instruments Japan Ltd., MS Shibaura Building 9F, 4-13-23 Shibaura, Minato-ku, Tokyo, Japan 108; Tel: 81 3 3769 8700

Table 5-1. RTC Worldwide Locations (Concluded)
BRAZIL: Texas Instruments Electronicos do Brasil Ltda., Av. Eng. Luiz Carlos Berrini, 1461, 11o andar, 04571 Sao Paulo, SP, Brazil; Tel: 55 11 535 5133
FEDERAL REPUBLIC OF GERMANY: Texas Instruments Deutschland GmbH, Kirchhorster Strasse 2, 3000 Hannover 51, FR Germany; Tel: 49 511 64802
117. logy TI provided students and professors at more than 200 universities with resources to study the technology and offer suggestions for improvements and new applications. University work, along with efforts of third-party developers, helped define new applications far beyond the niche markets of the early 80s.

A broad application base led to significant cost reductions by 1987 because the higher volume enabled more efficiencies through mass production. Continuous advances in fabrication process technology contributed to low-cost mass production and enabled TI to incorporate numerous functions on a single DSP.

As the number of functions performed by a single processor increased, products could be designed to be lightweight and portable, which made the DSP appeal to a growing number of consumer OEMs. Texas Instruments' world-class development support led to shorter design cycles and contributed to the progress in customer product technologies. The market exploded.

Today, more than 10,000 designers have gained the benefits that TMS320 DSPs bring to applications. More than 100 independent software and hardware third parties support the development of products incorporating TI DSPs. TI also offers seminars and workshops on product applications and assists potential customers who want to incorporate DSPs in their products. TI is firmly committed to the future of DSP and will continue to develop new devices and applications that will drive techno
118. logy into the next century.

A.3 The TMS320 Product Roadmap

The TMS320 family of 16/32-bit single-chip digital signal processors combines the flexibility of a high-speed controller with the numerical capability of an array processor, offering an inexpensive alternative to microcontrollers, custom VLSI, and bit-slice processors.

The combination of the TMS320's high degree of parallelism and its specialized digital signal processing instruction set provides speed and flexibility to produce a CMOS microprocessor family that is capable of executing up to 50 MFLOPS or 275 MOPS. The TMS320 family optimizes speed by implementing functions in hardware that other processors implement through software or microcode. This hardware-intensive approach provides the design engineer with power previously unavailable on a single chip. The newest TI generation of floating-point DSPs, the TMS320C4x, is designed for high-performance parallel processing applications.

The TMS320 family consists of five generations (three fixed-point and two floating-point) of digital signal processors. The fixed-point devices are members of the TMS320C1x, TMS320C2x, or TMS320C5x generation, and the floating-point devices belong to the TMS320C3x or TMS320C4x generation.

Figure A-1 shows the TMS320 family. Table A-1 provides a tabulated overview of each member's memory capacity, number of I/O ports (by type)
119. lowing the user to debug code in C, assembly, or both.

Key features of the TMS320C3x software simulator include:
- Execution of user-oriented DSP programs on a host computer
- Inspection and modification of registers
- Data and program memory modification and display
  - Modification of an entire block at any time
  - Initialization of memory before a program is loaded
- Simulation of peripherals, caches, and pipelined timings
- Extraction of instruction cycle timing for device performance analysis
- Programmable breakpoints on
  - Instruction acquisition
  - Memory reads and writes (data or program)
  - Data patterns on the data bus or the program bus
  - Error conditions
- Trace on
  - Accumulator
  - Program counter
  - Auxiliary registers
- Single-stepping of instructions
- Interrupt generation at user-specified intervals
- Error messages for
  - Illegal opcodes
  - Invalid data entries
- Execution of commands from a journal file

Execution is halted when a branch to self is detected. Once program execution is suspended, the internal registers and both program and data memories can be inspected and/or modified. The trace memory can also be displayed. A record of the simulation session can be maintained in a journal file so that it can be re-executed to regain the same machine state during another simulation session
120.
- Simulation of the TMS320C31's entire instruction set
- Simulation of the key features of the TMS320C31 peripherals
- Command entry from either menu-driven keystrokes (menu mode) or line mode
- Help menus for all screen-displayed modes
- Interface that can be user-customized
- Simulation parameters quickly stored in and retrieved from files to facilitate preparation for individual sessions
- Reverse assembly for editing and reassembling source statements
- Memory that can be displayed at the same time as
  - Hexadecimal 32-bit values
  - Assembled source
- Execution modes
  - Single/multiple instruction count
  - Single/multiple cycle count
  - Until condition is met
  - While condition exists
  - For set loop count
  - Unrestricted run with halt by keyed input
- Trace execution with display choices
  - Designated expression values
  - Cache memory
  - Instruction pipeline
- Simulation of cache utilization
- Cycle counting
  - Display of the number of clock cycles in a single-step operation or in the run mode
  - Externally generated mode that can be configured with wait states for accurate cycle counting

The simulator lets you verify and monitor the state of the processor. Simulation speed can be either thousands of instructions per second (VAX/VMS and SUN-3 UNIX) or hundreds of instructions per second (PC-DOS/MS-DOS).

The TMS320C31 simulator is available for the IBM PC-DOS/MS-DOS (5.25-inch floppy), the VAX/VMS i
121. m generation utility, RTXCgen, which permits interactive definition of the system components: tasks, queues, semaphores, memory partitions, and mailboxes. RTXCgen maintains the user-defined list of all application- or topology-dependent attributes. For example, resizing of a memory partition requires only the regeneration of the C source file for memory partitions and no changes in the application source code. RTXCgen automatically monitors changes made to the system component definitions. When directed to generate C source code for system tables, RTXCgen also produces header files only for those system components that have been changed. Thus, RTXCgen promotes concordance between the source code representing the specified components of the application and the header files used for referencing members of that application. In addition, RTXCgen provides listings of all system components that serve as a primary source for system-level documentation.

A system-level debug utility, RTXCbug, is also common to both kernels. RTXCbug examines the current state of the tasks, queues, and semaphores and presents a coherent picture, or snapshot, of the interaction between the system and the application tasks. It even permits manual task management.

RTXC/MP includes two special utilities not found in the single-processor RTXC kernel. RTXC monitors the system and provides on
122. n approach to speech recognition that is particularly adept at handling voices over the telephone. Telephone transactions are one area in which speech recognition technology has a compelling market need. VPC has been supplying speech recognition technology to telecom system manufacturers and over-the-phone service providers for several years, allowing these firms to replace human operators. VPC recognizers are being used in a wide array of applications such as credit card verification, operator intercept, telephone order entry, and voice mail.
4.1.2 Lower Cost and More Recognizers
The VPC recognition software requires a high-performance platform that can execute both signal-processing and general-purpose algorithms. Since such hardware platforms did not exist on the market in 1989, VPC developed and built an ISA board with two different processors: the Intel i386 microprocessor and the Texas Instruments TMS320C25 signal processor. All of the cycles of the ISA board were needed to execute one speaker-independent speech recognizer in realtime. Since 1989, as their customers required more and more lines of speech recognition to automate over-the-phone services, VPC needed a new hardware platform that could provide more lines of recognizers at a lower cost per line. VPC also needed a more powerful hardware platform to run new recognition algorithms being developed in their research lab. As VPC engineers saw it, there were two ways to reduce t
123. n backup format on 1600 bpi magnetic tape, and the SUN-3/4 UNIX (in TAR format on 1600 bpi magnetic tape) operating systems. The PC configuration requires a minimum of 512K bytes for the TMS320C31 simulator.
5.5 TMS320C3x Evaluation Module
The TMS320C3x evaluation module (EVM) is a low-cost development board used for device evaluation, benchmarking, and limited system debug. The TMS320C3x EVM (see Figure 5-10) eliminates the cost barrier to evaluating and developing embedded systems based on the TMS320C31. Features include:
- Assembler
- On-board memory
- Host upload/download capabilities
- I/O capability
Figure 5-10. TMS320C3x EVM
The TMS320C3x EVM enables you to benchmark and evaluate code in real time while the device is operating at 30 MHz in the rich development environment of the TMS320C3x assembler/linker and C/assembly source debugger interface. Applications can be benchmarked and tested easily with the analog-ready interface. The TMS320C3x EVM comes complete with a PC half-card and software package. The EVM board contains:
- One TMS320C30, a 33-MFLOP, 32-bit processor. TMS320C31 applications can be developed by using only those C30 features available on a C31
124. nal. This pin indicates that the external device is prepared for a transaction completion.
HOLD (1 pin): Hold signal. When HOLD is a logic low, any ongoing transaction is completed. The A23-A0, D31-D0, STRB, and R/W signals are placed in a high-impedance state, and all transactions over the primary bus interface are held until HOLD becomes a logic high or the NOHOLD bit of the primary bus control register is set.
HOLDA: Hold acknowledge signal. This signal is generated in response to a logic low on HOLD. It signals that A23-A0, D31-D0, STRB, and R/W are placed in a high-impedance state and that all transactions over the bus will be held. HOLDA will be high in response to a logic high of HOLD or when the NOHOLD bit of the primary bus control register is set.
Table notes: input (I), output (O), high-impedance (Z) state. S = SHZ active, H = HOLD active, R = RESET active.
Table 2-9. TMS320C31 Signal Descriptions (Continued)
Columns: signal name, number of pins, I/O/Z, description, condition when signal is in high Z.
Control Signals (10 pins)
RESET (1 pin): Reset. When this pin is a logic low, the device is placed in the reset condition. When RESET becomes a logic 1, execution begins from the location specified by the reset vector.
IACK (1 pin, O/Z): Interrupt acknowledge signal. IACK is active during the IACK instruction. This can be used to indicate the beginning or end of an interrupt service routine. High Z when: S.
MCBL/MP: Microcomputer boot loader/microproces
125. nalyzes code in order to optimize the usage of memory and register variables
- ANSI-standard runtime-support library
- ROM-able, relocatable, and reentrant code
- The ability to link C programs with assembly language routines, allowing hand coding of time-critical functions in assembly language
- A full-featured, flexible linker that allows total control over memory allocation, memory configuration, and partial linking, and contains features that allow easy runtime relocation of code
- A C shell program that facilitates one-step translation from C source to executable code
- Fast compilation to increase productivity
- Unlimited symbol table space, up to the amount of available host memory
- Complete and useful diagnostics (error messages)
- An archiver utility that allows you to collect files into a single archive file, or library, by adding new files or by extracting, deleting, or replacing files. You can use a library of object files as input to the linker
- Ability to expand in line both runtime-support and user-defined functions
- A utility that builds object libraries from source libraries
- A variety of listing files, including:
  - Assembly source file, which can optionally include interlisted C source code as well as register-usage information
  - Preprocessed output file, useful for separating preprocessing/parsing if memory limitations dictate
126. ng is being performed and that the label or expression following the at character (@) is used to form the data address. Here is an example:
    ADDI @0BCDEh,R7
In this instruction, the data address is formed by concatenating 0BCDEh with the current value of the data page pointer. The contents of this location are added to R7 and stored in R7.
Braces ({ and }) indicate a list. The symbol | (read as or) separates items within the list. Here's an example of a list:
    { * | *+ | *- }
This provides three choices: *, *+, or *-. Unless the list is enclosed in square brackets, you must choose one item from the list.
Some directives can have a varying number of parameters. For example, the .byte directive can have up to 100 parameters. The syntax for this directive is:
    .byte value1 [, ... , valuen]
This syntax shows that .byte must have at least one value parameter, but you have the option of supplying additional value parameters, separated by commas.
Related Documentation From Texas Instruments
TMS320C3x User's Guide (literature number SPRU031) describes the C3x (C30 and C31) 32-bit floating-point microprocessors, developed for digital signal processing as well as embedded control applications. Covered are its architecture, internal register structure, instruction set, pipeline, specifications, and operation of its DMA and its two serial ports. So
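The way a direct address is assembled from the data page pointer and the instruction operand can be sketched in C. This is only an illustration: the helper name `direct_address` is hypothetical, and it assumes the C3x arrangement in which the 8 least significant bits of the data page pointer supply address bits 23-16 and the 16-bit operand supplies bits 15-0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a TI API): form a 24-bit data address for
 * direct addressing. Assumption: the 8 LSBs of the data page pointer
 * (DP) become address bits 23-16, and the 16-bit value encoded in the
 * instruction becomes address bits 15-0. */
static uint32_t direct_address(uint32_t dp, uint32_t offset16)
{
    assert(offset16 <= 0xFFFFu);            /* operand field is 16 bits wide */
    return ((dp & 0xFFu) << 16) | offset16; /* concatenate page and offset */
}
```

With a data page pointer of 80h, for example, `direct_address(0x80, 0xBCDE)` yields 80BCDEh, which is the location an `ADDI @0BCDEh,R7` would read under these assumptions.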
127. nications and peripheral controllers. Clients and servers are Precise/MPX tasks. The only difference is that a server is created with the Server create primitive, and after it is created, it initializes itself differently. Part of this initialization is registering the server's service with a registry so that any client task can use the server.
I/O Components
Precise/MPX is augmented with optional I/O software components that support the following services:
- SDLC
- LAPB
- Mil-Std-1553
- TCP/IP
These components are written almost entirely in C and are completely reusable for any new hardware configuration.
Multiprocessing
The Precise/MPX kernel has been designed to support various commonly used multiprocessor hardware configurations. It is a unique technology due to the support for multiprocessor applications using DSPs or mixes of DSP and non-DSP processors. Precise/MPX has been successfully used on multiprocessors based upon VMEbus and NuBus hardware, consisting of from two to 20 microprocessors and using the parallel backplane as a high-speed interconnection network. It has also been used in proprietary hardware applications where from three to nine microprocessors are interconnected with memory or high-speed serial data interfaces. In all cases, the applications software has been designed independently of the underlying hardware o
129. nterlocked operations through its XF0 and XF1 pins and dedicated interlocked-operation instructions. XF0 and XF1 can also be used as bit I/O signals.
2.8 Interrupts
The TMS320C31 supports four external interrupts (INT3-INT0), a number of internal peripheral interrupts, 28 software interrupts (traps), and a nonmaskable external RESET signal. The external and internal peripheral interrupts can be used to interrupt either the DMA or the CPU. When the CPU responds to the interrupt, the IACK pin can be used to signal an external interrupt acknowledge. Typical interrupt latency times are less than 1 µs for a 50-ns TMS320C31.
2.9 TMS320C31 Signal Descriptions
Table 2-9 describes the external signals of the TMS320C31. They are listed according to the signal name; the number of pins allocated; the input (I), output (O), or high-impedance state (Z) operating modes; a brief description of the signal's function; and the condition that places an output pin in high impedance. A line over a signal name (for example, RESET) indicates that the signal is active low (true at a logic 0 level).
Table 2-9. TMS320C31 Signal Descriptions
Columns: signal name, number of pins, I/O/Z, description, condition when signal is in high Z.
Primary Bus Interface (61 pins)
Read/write signal. This pin is high when a read is performed and low when a write is performed over the parallel interface.
Ready sig
130. nual with many examples
- Prototype and test TMS320 BOS applications on a PC
- Source code site license, unlimited product usage
- No-royalty executable code distribution
- One year of technical support and revision updates
BOS is optimized for all TMS320 DSPs and has excellent performance. BOS is configured to work with the Texas Instruments C development systems and includes a working application.
6.5 Computer Motion, Inc.
Computer Motion, Inc., 270 Storke Rd., Suite 11, Goleta, CA 93117. (805) 685-3729. FAX: (805) 685-9277.
C++ Compiler
Computer Motion, Inc. has introduced object-oriented programming using C++ for the TI TMS320C30 and TMS320C31 DSPs. This compiler is based on the GNU C++ retargetable compiler and executes on SPARCstation platforms. This compiler translates programs directly to TMS320 assembly language. The TI assembler and linker can then be used to create the final executable code. The object code generated from the assembly language output can be linked with other programs compiled with both the TI C compiler and the runtime-support libraries. The package includes documentation manuals and a quarter-inch cartridge tape that contains both a C and a C++ compiler.
6.6 Electronic Tools GmbH
Electronic Tools GmbH, Zum Blauen See 7, 4030 Ratingen, Germany. 0049 2102 88010. FAX: 0049 2102 880123.
miniKit 320C31 Embedded DSP System
miniKit 320C31 is a complete embedded D
131. om chips and bit-slice processors. They quickly won acceptance in high-performance applications such as military systems. High-volume applications such as modems soon followed as the cost of TI DSPs declined dramatically. A processor costing $500 in 1982 now costs $5 (quantity 1) and as little as $3 in volume. Similar price reductions will transform former niche applications such as multimedia into a widespread standard in the near future.
In addition to lower prices, improvements in ease of use and increased system integration have enabled DSPs to displace traditional microcontrollers in many applications. As systems become more numeric intensive, the DSP alternative is increasingly attractive. Evidence of this trend can be seen in semiconductor manufacturers' attempts to incorporate DSP-like functionality into traditional controllers. DSPs are clearly moving into the mainstream. The evidence suggests that DSPs will be to the 1990s what general-purpose microprocessors were to the 1970s and 1980s.
A.2 The TI Role in the DSP Industry
Advanced technology products and extensive development support have made Texas Instruments a dominant force in the DSP industry. TI has played a vital role in educating new users and has made a substantial investment in new product development since patenting its first digital signal processor in 1982. In a dedicated effort to train upcoming designers in DSP techno
132. on Variable Elimination, Register Variables, and Loop Test Replacement for Floating-Point Compilers
    float a[10], b[10];
    scale(float k)
    {
        int i;
        for (i = 0; i < 10; ++i)
            a[i] = b[i] * k;
    }
TMS320C31 compiler output is:
    _scale:
        LDI   @CONST+0,AR4    ; AR4 = &a[0]
        LDI   @CONST+1,AR5    ; AR5 = &b[0]
        MPYF  R4,*AR5++,R0    ; compute first product
        RPTS  8               ; loop for next 9
        STF   R0,*AR4++       ; store this product
     || MPYF  R4,*AR5++,R0    ; and compute next
        STF   R0,*AR4++       ; store last product
This process shows general and floating-point-specific optimizations working together to generate highly efficient code. Induction variable elimination and loop test replacement allow the compiler to recognize the loop as a simple counting loop and then generate a repeat block. Strength reduction turns the array's references into efficient pointer autoincrements. The compiler unrolls the loop once to separate the first multiply and last store, allowing the body of the loop to be written as a single parallel instruction.
Delayed Instructions
The TMS320C31 compiler supports delayed branch instructions that can be inserted three instructions early in an instruction stream, avoiding costly pipeline flushes associated with normal branches. The compiler uses unconditional delayed branches wherever possible and conditional delayed branches for counting loops. See Figure 5-6.
Figure 5-6. TMS320C31 Compiler Delayed
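The shape of the compiler output described above (first multiply peeled off in front, a store paired with the next multiply inside the loop, last store after it) can be mimicked by hand in C. The sketch below is an illustration rather than the compiler's actual transformation; the function names and the `N` macro are assumptions introduced here.

```c
#include <assert.h>

#define N 10   /* array length used in the figure's example */

/* Straightforward indexed form, as a programmer would write it. */
static void scale_indexed(float *a, const float *b, float k)
{
    int i;
    for (i = 0; i < N; ++i)
        a[i] = b[i] * k;
}

/* Hand-written equivalent of the generated code's structure: the first
 * product is computed before the loop, each of the N-1 passes stores
 * the previous product and computes the next one, and the last product
 * is stored after the loop. Pointers are stepped with post-increment,
 * mirroring the autoincrement addressing in the listing. */
static void scale_pipelined(float *a, const float *b, float k)
{
    float prod;
    int count;

    assert(a != 0 && b != 0);
    prod = *b++ * k;                 /* compute first product */
    for (count = 0; count < N - 1; ++count) {
        *a++ = prod;                 /* store this product */
        prod = *b++ * k;             /* and compute next */
    }
    *a = prod;                       /* store last product */
}
```

On the TMS320C31 the store and the multiply inside the loop are a single parallel instruction, which is why the loop body fits under RPTS; in portable C they are simply two statements.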
133. operations set the condition flags of the status register according to whether the result is zero, negative, etc. This includes register load and store operations as well as arithmetic and logical functions. When the status register is loaded, however, a bit-for-bit replacement is performed with the contents of the source operand, regardless of the state of any bits in the source operand. Therefore, following a load, the contents of the status register are equal to the contents of the source operand. This allows the status register to be easily saved and restored.
The CPU/DMA interrupt enable register (IE) is a 32-bit register. The CPU interrupt enable bits are in locations 10-0. The DMA interrupt enable bits are in locations 26-16. A 1 in a CPU/DMA interrupt enable register bit enables the corresponding interrupt. A 0 disables the corresponding interrupt.
The CPU interrupt flag register (IF) is also a 32-bit register. A 1 in a CPU interrupt flag register bit indicates that the corresponding interrupt is set. A 0 indicates that the corresponding interrupt is not set.
The I/O flags register (IOF) controls the function of the dedicated external pins, XF0 and XF1. These pins may be configured for input or output and may also be read from and written to.
The repeat counter (RC) is a 32-bit register used to specify the number of times a block of code is to be repeated when performin
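The IE register layout described above (CPU enable bits in locations 10-0, DMA enable bits in locations 26-16) can be expressed as masks in C. This is a minimal sketch that works on a plain 32-bit image of the register; the macro and function names are hypothetical, and the mapping of individual bit numbers to specific interrupt sources is deliberately left out because it is not given here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit ranges taken from the text: CPU enables occupy bits 10-0 and
 * DMA enables occupy bits 26-16 of the 32-bit IE register. */
#define IE_CPU_MASK   0x000007FFu
#define IE_DMA_MASK   0x07FF0000u
#define IE_DMA_SHIFT  16

/* Hypothetical helpers operating on a copy of the register value. */
static uint32_t ie_enable_cpu(uint32_t ie, unsigned bit)
{
    assert(bit <= 10);                      /* only bits 10-0 are CPU enables */
    return ie | (1u << bit);                /* a 1 enables the interrupt */
}

static uint32_t ie_enable_dma(uint32_t ie, unsigned bit)
{
    assert(bit <= 10);                      /* 11 DMA enables: bits 26-16 */
    return ie | (1u << (bit + IE_DMA_SHIFT));
}

static uint32_t ie_disable_cpu(uint32_t ie, unsigned bit)
{
    assert(bit <= 10);
    return ie & ~(1u << bit);               /* a 0 disables the interrupt */
}
```

Keeping the CPU and DMA fields behind separate helpers makes it harder to enable a DMA interrupt by accident when only the CPU should respond.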
134. opment Support
Discussion of code generation, debug, and system integration development flow. Summarizes features of Texas Instruments simulation and emulation development tools and describes available technical documentation and technical assistance.
TMS320C31 Third-Party Support
Alphabetical listing of third-party manufacturers and suppliers who provide development support products for the TMS320C31, and description of their products.
TMS320 DSP Family
Description of the DSP market, TI's role in the DSP industry, the TMS320 product roadmap, and the five generations of TMS320 devices.
Part Ordering Information
Listings of the hardware and software available from Texas Instruments to support the TMS320C31 device.
Style and Symbol Conventions
This document uses the following conventions:
- Program listings, program examples, interactive displays, filenames, and symbol names are shown in a special typeface similar to a typewriter's. Examples use a bold version of the special typeface for emphasis; interactive displays use a bold version of the special typeface to distinguish commands that you enter from items that the system displays (such as prompts, command output, error messages, etc.). Here is a sample program listing:
    0011 0005 0001    .field 1, 2
    0012 0005 0003    .field 3, 4
    0013 0005 0006    .field 6, 3
    0014 0006         .even
Here is an example of a system prompt and a command that y
135. Loop Unrolling
When the compiler can determine that a short loop is executed a low, constant number of times, it replicates the body of the loop rather than generating the loop (note that low and short are subjective judgments made by the compiler). This avoids any branches or use of the repeat registers. See Figure 5-7.
Figure 5-7. Loop Unrolling
    add3(int a[3])
    {
        int i, sum = 0;
        for (i = 0; i < 3; ++i)
            sum += a[i];
        return sum;
    }
TMS320C31 compiler output is:
    _add3:
        LDI FP 2 AR4
        LDI AR4 RC
        ADDI AR4 RC
        ADDI AR4 RC
        LDI RC R0
The compiler determines that this loop is short enough to unroll, resulting in a simple 3-instruction sequence and no branches.
5.2 TMS320 Programmer's Interface (C/Assembly Source Debugger)
The TMS320 Programmer's Interface brings new levels of power and flexibility to embedded systems development. The interface (debugger) is now available on virtually all TMS320 development tools, so moving to another tool or another generation of processor is greatly simplified.
The debugger is an advanced software interface that runs on a PC and supports TI's unique scan-based realtime TMS320C3x XDS emulator. The debugger provides complete control over programs written in C or assembly language. The debugger improves productivity by enabling
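The effect of unrolling the add3 loop can be written out by hand in C. The sketch below only illustrates the source-level equivalence; the names `add3_loop` and `add3_unrolled` are introduced here, and the compiler performs this transformation on its own without any change to the source.

```c
#include <assert.h>

/* Loop form, as in the figure: three iterations, one add per pass. */
static int add3_loop(const int a[3])
{
    int i, sum = 0;
    for (i = 0; i < 3; ++i)
        sum += a[i];
    return sum;
}

/* Fully unrolled form: one load and two adds, with no loop counter,
 * no branch, and no use of the repeat registers. */
static int add3_unrolled(const int a[3])
{
    assert(a != 0);
    return a[0] + a[1] + a[2];
}
```

Both functions return the same value for any three-element input; the unrolled form simply trades a few words of code for the loop overhead.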
136. or branches. See Figure 5-2.
Loop Induction Variable Optimizations, Strength Reduction
Loop induction variables are variables whose value within a loop is directly related to the number of executions of the loop. Array indices and control variables of FOR loops are very often induction variables. Strength reduction is the process of replacing costly expressions involving induction variables with more efficient expressions. For example, code that indexes into a sequence of array elements is replaced with code that increments a pointer through the array. Loops controlled by incrementing a counter are written as repeat blocks or by using efficient decrement-and-branch instructions. Induction variable analysis and strength reduction together often remove all references to the programmer's loop control variable, allowing it to be eliminated entirely.
Loop Rotation
The compiler evaluates loop conditionals at the bottom of loops, saving a costly extra branch out of the loop. In many cases, the initial entry conditional check and the branch are optimized out.
Loop Invariant Code Motion
This optimization identifies expressions within loops that always compute the same value. The computation is moved in front of the loop, and each occurrence of the expression in the loop is replaced by a reference to the precomputed value.
In-Line Expansion of Function Calls
The special keyword inline directs the compiler to replace calls to a function
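Strength reduction, loop rotation, and loop-invariant code motion can be seen together in a small hand-transformed example. This is a sketch of what the optimizations amount to at the source level, not the compiler's actual output; the function names and the particular expression `i * 4 + x * y` are assumptions chosen for illustration.

```c
#include <assert.h>

/* Original form: x * y is recomputed on every pass, i * 4 needs a
 * multiply, and out[i] needs an index calculation. */
static void fill_naive(int *out, int n, int x, int y)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; ++i)
        out[i] = i * 4 + x * y;
}

/* Hand-optimized form showing the three transformations. */
static void fill_optimized(int *out, int n, int x, int y)
{
    int inv = x * y;          /* loop-invariant code motion: computed once */
    int val = inv;            /* value of i * 4 + inv when i == 0 */

    assert(n >= 0);
    if (n > 0) {              /* entry check performed once (loop rotation) */
        do {
            *out++ = val;     /* pointer increment replaces out[i] */
            val += 4;         /* strength reduction: add replaces i * 4 */
        } while (--n > 0);    /* conditional evaluated at the loop bottom */
    }
}
```

After the transformation the induction variable `i` no longer appears at all, which is the elimination of the loop control variable that the text mentions.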
137. ords
- Multiple DSP programs on a single chip
- General-purpose and DSP-specific instructions
- Ease of design
- EPROM and OTP versions
- Fast time to market
- High-level language support
- Operating system support
- Extensive development support
- JTAG (IEEE) test bus
- System reliability
- Serial scan path for 99% fault grading
A.4 TMS320C1x
The TMS320C1x DSPs provide cost-effective solutions for many needs. TMS320C1x DSPs perform a multiply command at least 30 times faster than a general-purpose microprocessor. An on-chip hardware multiplier allows the TMS320C1x to produce results in a single instruction cycle. Instruction cycle times range from 160 to 280 ns. Higher performance is achieved through internal parallelism and a unique Harvard architecture, which allows program fetch to overlap data operations. The C1x generation includes DSPs optimized for specific high-performance applications such as speech synthesis, high-speed modems, and telephone systems. All TMS320C1x devices are software compatible for easy upgrade as application requirements change. TMS320C1x ROM-code versions can be used to reduce system costs.
On-chip serial ports, companding hardware, and a coprocessor interface make the TMS320C17 ideal for telecommunications applications. The TMS320C14 has been optimized for control applications such as disk drives and servo control. The C14 is the industry's first device to combine the high performance of a
138. ou might enter:
    C: csr -a /user/ti/simuboard/utilities
- Square brackets are also used as part of the pathname specification for VMS pathnames; in this case, the brackets are actually part of the pathname (they are not optional).
- In syntax descriptions, the instruction, command, or directive is in a bold typeface font, and parameters are in an italic typeface. Portions of a syntax that are in bold should be entered as shown; portions of a syntax that are in italics describe the type of information that should be entered. Here is an example of a directive syntax:
    .asect "section name", address
  .asect is the directive. This directive has two parameters, indicated by section name and address. When you use .asect, the first parameter must be an actual section name, enclosed in double quotes; the second parameter must be an address.
- Two vertical bars (||) identify a parallel instruction. An instruction that is preceded by two vertical bars will be executed in parallel with the previous instruction in the assembly language source file. Here is an example of a parallel instruction:
       MPYI3 R7 R4 R0
    || ADDI3 AR3 AR5 1 R3
  Since the ADDI3 is preceded with two vertical bars, the two lines of assembly language are considered a single instruction, where both an integer multiply and an integer add are performed.
- An at character (@) preceding a label or expression in an instruction indicates that direct addressi
139. pansion for TMS320C31 Compilers
- Register Variables and Register Tracking/Targeting
- Repeat Blocks, Autoincrement Addressing Modes, Parallel Instructions, Strength Reduction, Induction Variable Elimination, Register Variables, and Loop Test Replacement for Floating-Point Compilers
- TMS320C31 Compiler Delayed Branch Optimizations
- Loop Unrolling
- The Basic Debugger Display
- Debugger's Data Display
- TMS320C3x EVM
- TMS320C3x XDS Emulator
- HP 64776 Analysis Subsystem
- Realtime Application Tasks
- MX31 Fitted With a Preliminary CCD Camera Interface Daughter Board
- SPOX Architecture
- SPOX Debug Support
- Open Signal Processing Architecture
- AdaScope Debugger Screen
- Logic Analyzer Family
- TMS320 Device Evolution
- TMS320 Device Nomenclature
140. parison Versus Other Embedded Controllers
3.2 TMS320C31 Benchmark Performance Versus Other Embedded Controllers
3.2.1 Dhrystone Benchmark
3.2.2 Bubble and Quick Sort Benchmarks
3.2.3 matmult Benchmark
3.2.4 anneal Benchmark
3.2.5 Benchmark Summary
4 Application Examples
4.1 Telecommunications Example Using SPOX
4.1.1 Speech Recognition With TMS320C31 and SPOX
4.1.2 Lower Cost and More Recognizers
4.1.3 VPRO 4: A Homogeneous Multi-DSP Architecture
4.1.4 From Tiger 30 to Realtime Recognition
4.1.5 A New Level of Interoperability
4.2 Instrumentation Application and Processor Evaluation Example
4.2.1 Background and System Description
4.2.2 Archive Shuffle
4.2.3 Waveform Processing
4.2.4 Fast Fourier Transform
4.2.5 Advantages of a TMS320C31 System
4.3 Test Equipment Example Using SPOX
141. peripherals
- HostBus interface: standardized 8/16/32-bit parallel interface for attaching microcontrollers, also available for bit I/O
- ExpansionBus interface: TMS320C31-specific bus (32-bit parallel) for expanding memory and attaching peripherals
- Serial interface
- Timer interface
- Emulation interface
6.7 Integrated Motion Incorporated
Integrated Motion Incorporated, 758 Gilman Street, Berkeley, California 94710. (510) 527-5810. FAX: (510) 527-7843.
MX31 Modular Embedded System
The MX31 is a low-cost, modular, small-footprint, general-purpose embedded controller with expansion daughter boards designed for applications involving motion control. The system is based on a motherboard/daughter-board architecture for flexibility and low cost. The motherboard is a processor unit consisting of a 33-MHz TMS320C31 floating-point DSP, ROM, RAM, and other support devices. Each daughter board provides the peripherals required to control a two-axis, servo-actuated mechanical system. Up to four daughter boards can be stacked in a single system to control up to eight servo axes.
- Motherboard features:
  - 33-MHz TMS320C31 floating-point DSP
  - 16 to 256K-word ROM
  - Up to 256K-word zero-wait-state RAM
  - RS232 serial port
  - 16-bit parallel I/O
- Daughter-board features:
  - 2-channel 16-bit shaft encoder interface
  - 2-channel 16-bit analog output
  - 12-bit digital input
  - 6-bit digital output
  - Up to 32K bytes nonvolatile RAM
  - All digital I
142. peat interrupt register; RM: repeat mode bit; RS: repeat start register; TOS: top of stack; PC: program counter; C: carry bit.
Table 2-6. Load and Store Instruction Summary
- Load floating-point exponent: src (exponent) → Rn (exponent)
- LDFcond, load floating-point value conditionally: If cond = true, src → Rn; else Rn is not changed
- LDFI, load floating-point value interlocked: Signal interlocked operation; src → Rn
- Else Dreg is not changed
- Store integer interlocked
LEGEND:
- src: general addressing modes
- src1: three-operand addressing modes
- src2: three-operand addressing modes
- Csrc: conditional-branch addressing modes
- Sreg: register address (any register)
- count: shift value (general addressing modes)
- SP: stack pointer
- GIE: global interrupt enable register
- RM: repeat mode bit
- TOS: top of stack
- Dreg: register address (any register)
- Rn: register address (R7-R0)
- Daddr: destination memory address
- ARn: auxiliary register n (AR7-AR0)
- addr: 24-bit immediate address (label)
- cond: condition code (see Chapter 11)
- ST: status register
- RE: repeat interrupt register
- RS: repeat start register
- PC: program counter
- C: carry bit
Table 2-7. Arithmetic Instruction Set Summary
- Arithmetic shift: If count ≥ 0, (shifted Dreg left by count) → Dreg; else
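The arithmetic-shift entry that opens Table 2-7 can be modeled in C. The table row is cut off after "Else", so the negative-count branch below rests on an assumption: that a negative count shifts right by the magnitude of the count with sign extension. The function name `ash_model` and the restriction of the count to the range -31 to 31 are also assumptions of this sketch, not properties of the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an arithmetic shift on a 32-bit integer.
 * count >= 0: shift left by count.
 * count <  0: shift right by -count, replicating the sign bit
 *             (assumed behavior; the source table is truncated here). */
static int32_t ash_model(int32_t value, int count)
{
    assert(count > -32 && count < 32);   /* simplification for this sketch */

    if (count >= 0)
        return (int32_t)((uint32_t)value << count);

    count = -count;
    if (value >= 0)
        return value >> count;

    /* Portable sign-extending right shift for negative values:
     * complement, shift in zeros, complement again. */
    return (int32_t)~(~(uint32_t)value >> count);
}
```

The complement trick avoids relying on how a particular C compiler shifts negative signed values to the right, which the language leaves implementation-defined.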
143. priority over others, and those variables whose uses don't overlap may be allocated to the same register. Variables with specific requirements are allocated into registers that can accommodate them.
Autoincrement Addressing Modes
For pointer expressions of the form *p++, *p--, *++p, or *--p, the compiler uses efficient TMS320C31 autoincrement addressing modes. In many cases, where code steps through an array in a loop, such as for (i = 0; i < N; ++i) a[i]..., the loop optimizations convert the array's references to indirect references through autoincremented register variable pointers. See Figure 5-5.
Repeat Blocks
The TMS320C31 compiler supports zero-overhead loops with the RPTS (repeat single) and RPTB (repeat block) instructions. The compiler can detect loops controlled by counters and generate them by using the efficient repeat forms: RPTS for single-instruction loops or RPTB for larger loops. For both forms, the iteration count can be either a constant or an expression. See Figure 5-3 and Figure 5-5.
Induction variable elimination and loop test replacement allow the compiler to recognize the loop as a simple counting loop and then generate a repeat block. Strength reduction turns the array references into efficient pointer autoincrements.
Figure 5-5. Repeat Blocks, Autoincrement Addressing Modes, Parallel Instructions, Strength Reduction, Inducti
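The pointer forms named above can be illustrated with two small C routines. This is only a sketch of the source patterns that map naturally onto autoincrement and autodecrement addressing; the function names are introduced here, and how a given compiler actually allocates registers for them is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array with an index; each pass needs the address a + i. */
static long sum_indexed(const int *a, size_t n)
{
    long sum = 0;
    size_t i;
    for (i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

/* Same sum written with *p++: the pointer itself is the induction
 * variable, so each access is a load followed by a pointer step. */
static long sum_pointer(const int *p, size_t n)
{
    long sum = 0;
    while (n-- > 0)
        sum += *p++;
    return sum;
}

/* Reverse copy using a post-increment store and a pre-decrement load. */
static void reverse_copy(int *dst, const int *src, size_t n)
{
    const int *p;
    assert(dst != 0 && src != 0);
    p = src + n;                 /* one past the last source element */
    while (n-- > 0)
        *dst++ = *--p;
}
```

Writing `sum_indexed` is perfectly fine in practice; the point of the optimizations described in this section is that the compiler can turn it into the `sum_pointer` form on its own.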
144. r. An application can _Create any number and any type of tasks, subject only to available memory. After system initialization, the Precise/MPX kernel will _Create a user-specified main task and dispatch this task. The main task is written by the user to create and dispatch all remaining components of the realtime application. Once a task has been created, it will execute subject to its own priority and the actions it performs. Task switching occurs only when a task executes a Precise/MPX primitive that readies a higher-priority task or when an interrupt event readies a higher-priority task.
Inter-Task Communication
Inter-task communication and task synchronization are supported with messages passed between tasks. A software designer usually uses the _Send, _Receive, or _Reply primitives for message passing. These three primitives are the core interface to the Precise/MPX executive. The structure of a design is represented by how the application uses inter-task communication. _Send is used to send a message to another task and cause the kernel to ready that task and run it. _Receive is used by a task to request that a message be sent to it and cause the kernel to ready another task. _Reply is used to issue a response from a receiving task to a sending task and to ready the sending task. Thus, with three simple primitives, a designer can specify all inter-task comm
145. r a wide range of interoperable products. With SPOX serving as the common thread, application developers and system integrators not only can apply these products toward solving today's problems but are also afforded a bridge to future DSP technologies through the SPOX OSPA (Open Signal Processing Architecture). Figure 6-5 depicts the OSPA framework for interoperability.
Figure 6-5. Open Signal Processing Architecture
[Figure: applications and laboratory systems; audio, image, speech, and telecommunications libraries; design tools, compilers, and debuggers; SPOX; device drivers; host operating system; host computer, DSP boards, and peripherals.]
Board-Level Products
SPOX is rapidly proliferating across a wide variety of board-level products targeted for current and emerging bus architectures (VMEbus, NuBus, EISA, SBus, etc.), allowing developers to buy off-the-shelf DSP platforms and data acquisition boards rather than building custom hardware.
Program Development Tools
SPOX supports development of application programs using high-level languages, including C and Ada. Several source-level debuggers are also being enhanced with knowledge of the SPOX runtime environment.
DSP Function Libraries
A growing number of vendors are offering platform-independent DSP functions, ranging from SPOX-compatible math libraries for audio or image processing to
146. r compares the device features and performance of the TMS320C31 to other embedded controllers. The TMS320C31's CPU provides higher system and numeric performance than CISC microprocessors and microcontrollers, and also provides higher sustained numeric performance than RISC embedded controllers. The TMS320C31 also incorporates several peripherals on chip, which helps reduce system cost and complexity. It also possesses a significant amount of on-chip memory, which facilitates the real-time execution of time-critical routines, reducing the need for expensive high-speed external memory. The topics discussed include: 3.1 TMS320C31 Feature Comparison Versus Other Embedded Controllers; 3.2 TMS320C31 Benchmark Performance Versus Other Embedded Controllers. 3.1 TMS320C31 Feature Comparison Versus Other Embedded Controllers: Table 3-1 lists and describes the fields shown in Table 3-2. Table 3-2 highlights the features and performance of several embedded controllers in the same price range, including the TMS320C31. Table 3-1. Description of the Fields in Table 3-2: Multiply Time, ns (Integ/Float): The time the processor takes to perform a single nonpipelined integer multiply / floating-point multiply. Table 3-2. Fea
147. r interconnection network, and the designer was able to reconfigure the application to take advantage of the number and type of processors used in the hardware without having to change the design or any applications source code. 6.10 Spectron Microsystems, Inc., 5266 Hollister Avenue, Santa Barbara, CA 93111, (805) 967-0503, FAX (805) 683-4995. SPOX Architecture: SPOX is a highly modular and configurable runtime environment that supports the C3x hardware platforms and can be integrated with application programs targeted for these systems. While it provides most of the functionality found in many realtime executives used with general-purpose microprocessors, SPOX has been specifically designed for the more demanding environment of TMS320C3x-based DSP systems: extensive numeric computation; realtime I/O; high-frequency data rates; limited program memory; multi-DSP system architectures; integration with an adjoining host computer. Because of its modular software architecture, SPOX can address a wide range of DSP applications (telecommunications, imaging, speech and audio, test and measurement, and multimedia, to name a few) without compromising system functionality and performance. The SPOX runtime environment can be reduced to as little as a few thousand words of code for small embedded applications requiring only a limited number of kernel functions. SPOX can also be integrated into a more comprehensi
148. rademark of Sonitech International Inc. SPOX is a trademark of Spectron Microsystems, Inc. Tiger 30 is a trademark of DSP Research, Inc. VPRO 4 is a trademark of Voice Processing Corporation. If You Need Assistance. If you want to request more information about Texas Instruments Digital Signal Processing (DSP) products: call the CRC (Texas Instruments Customer Response Center) at (800) 336-5236, or write to Texas Instruments Incorporated, Market Communications Manager, MS 736, P.O. Box 1443, Houston, Texas 77251-1443. If you want to order Texas Instruments documentation: call the CRC at (800) 336-5236. If you want to ask questions about product operation or report suspected problems: call the DSP hotline at (713) 274-2320. If you want to report mistakes in this document or any other TI documentation: fill out and return the reader response card at the end of this book, or send your comments to Texas Instruments Incorporated, Technical Publications Manager, MS 702, P.O. Box 1443, Houston, Texas 77251-1443. Contents: 1 Introduction; 1.1 Embedded Controller Requirements; 1.2 TMS320C31 Key Features; 1.3 Compatible Devices; 1.4 TMS320C31 Development Support; 1.5 Benefits of a TMS320C31 Based Embedded Syst
149. rated divide-down clock is provided. The serial port can also be configured as timers or bit I/O pins. A special handshake mode allows the TMS320C31s to communicate via their serial ports with automatic synchronization. 2.6 Direct Memory Access (DMA): The on-chip DMA controller can read from or write to any location in the memory map without interfering with the operation of the CPU. The DMA controller can be configured to synchronize transfers with external, serial port, or timer interrupts. Therefore, the TMS320C31 can interface to slow memories and to on-chip and system peripherals without reducing throughput to the CPU. The DMA controller contains its own address generators, source and destination registers, and transfer counter. Dedicated on-chip DMA address and data buses minimize conflicts between the CPU and the DMA controller for on-chip resources. A DMA operation consists of a block or single-word transfer to or from memory. Figure 2-6 shows the DMA controller with associated buses. [Figure 2-6. DMA Controller: DMADATA bus and DMAADDR bus connected to the DMA controller, which contains the Global Control Register, Source Address Register, Destination Address Register, and Transfer Counter Register.] 2.7 External Bus Operation: The TMS320C31 prim
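The DMA block transfer described in Section 2.6 can be pictured with a short software model. This is a sketch under stated assumptions, not the device's register map: the struct layout, the use of bit 0 of the global control register as a start bit, and the names DmaModel, dma_step, and dma_run are all hypothetical, and addresses are plain indices into a simulated memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the four DMA registers named in the text. */
typedef struct {
    uint32_t global_control;  /* bit 0 = start (assumed encoding) */
    uint32_t source;          /* source address register */
    uint32_t destination;     /* destination address register */
    uint32_t transfer_count;  /* transfer counter register */
} DmaModel;

/* Move one word and update the registers. Returns 1 if a word was moved. */
static int dma_step(DmaModel *dma, uint32_t *mem, size_t mem_words)
{
    assert(dma != NULL && mem != NULL);
    if ((dma->global_control & 1u) == 0 || dma->transfer_count == 0)
        return 0;
    assert(dma->source < mem_words && dma->destination < mem_words);
    mem[dma->destination++] = mem[dma->source++];
    if (--dma->transfer_count == 0)
        dma->global_control &= ~1u;  /* block complete: clear the start bit */
    return 1;
}

/* Run a whole block transfer. Returns the number of words moved. */
static size_t dma_run(DmaModel *dma, uint32_t *mem, size_t mem_words)
{
    size_t moved = 0;
    while (dma_step(dma, mem, mem_words))
        moved++;
    return moved;
}
```

A single-word transfer is the same sequence with the transfer counter set to 1. On the real device the transfer runs on dedicated buses in parallel with the CPU; the loop here only shows how the source, destination, and counter registers evolve.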
150. rates or building-block flexibility can easily use 2 or more TMS320C3x DSP chips to make simple, easy-to-use multiprocessor systems. The maximum capabilities of the hardware can be realized by using the Precise MPX executive. Precise MPX is a library of primitives that are used by a realtime software designer to extend the C language to a real-time concurrent C language with transparent support for multiprocessor applications. Designing applications using a concurrent programming model is the simplest and most natural paradigm for expressing a realtime problem in terms of a high-level programming language, and is the basis for modern programming languages such as Ada, C, Objective C, and Smalltalk. The Precise MPX kernel has been designed such that the benefits of this programming paradigm can be successfully applied to real-time embedded controller applications. These capabilities are provided in a very efficient ROMable kernel that typically requires only 16K bytes. Additional benefits of using Precise MPX are: Portability: the concurrent paradigm is hardware independent. Reusability: task objects communicate with other task objects or physical interrupts via specified interfaces. Scalability: any application that uses Precise MPX can be mapped from one to any number of DSPs without any change to the application software and no increase in the kernel overhead (in fact, the overhead decreases).
151. rchitecture. In a SPOX multiprocessing system, a copy of SPOX OS is required at each node of the system to manage local resources such as tasks and memory. The following independent software modules are provided: inter-task communication application programming interface (API); multiprocessor global shared memory manager; shared-memory interprocessor resource locks; on-chip peripheral support. Debug Support: The C source debugger can provide the following debug and profile capabilities via additional runtime support to the SPOX OS and extensions to the debugger, as shown in Figure 6-4: display of SPOX OS objects; set task-specific breakpoints; monitor and display system performance characteristics; invoke SPOX OS system calls. [Figure 6-4. SPOX Debug Support: SPOX DBUG on the development host with in-circuit emulation controller; SPOX DBUG runtime on the TMS320C3x/4x target system.] SPOX Products: SPOX products that are generally used by application developers and system integrators include: Software Components for Embedded Systems: For customers who build and develop realtime embedded DSP systems, SPOX is offered as a suite of software components sho
152. rovides for convenient communications. Full debug monitor software is included for dynamic debugging. EVB features include: 1M static RAM; wire-wrap area; dual-port memory; dynamic debug software; C compiler; up to 40-MHz operation. Appendix A: TMS320 DSP Family. Digital signal processors are programmable microprocessors designed for speed and flexibility. While they provide functionality similar to traditional microprocessors, they are distinguished by architectural differences which optimize their ability to quickly process complex mathematical formulas. This appendix describes the evolution of the DSP market and the role of TI in this market. The TMS320 roadmap and a description of each generation of devices are also presented. Topics: A.1 The DSP Market; A.2 The TI Role in the DSP Industry; A.3 The TMS320 Product Roadmap; A.4 TMS320C1x; A.5 TMS320C2x; A.6 TMS320C3x; A.7 TMS320C4x; A.8 TMS320C5x. A.1 The DSP Market: Over the last decade, DSP technology has made new products possible and many applications affordable. In the early 1980s, DSPs provided an off-the-shelf alternative to cust
153. rt product. TMX and TMP devices and TMDX development support tools are shipped with the following disclaimer: "Developmental product is intended for internal evaluation purposes." Note: Texas Instruments recommends that prototype devices (TMX or TMP) not be used in production systems because their expected end-use failure rate is undefined but predicted to be greater than standard qualified production devices. TMS devices and TMDS development support tools have been fully characterized, and their quality and reliability have been fully demonstrated. Texas Instruments standard warranty applies to TMS devices and TMDS development support tools. TMDX development support products are intended for internal evaluation purposes only. They are covered by Texas Instruments Warranty and Update Policy for Microprocessor Development Systems products; however, they should be used by customers only with the understanding that they are developmental in nature. B.3 Device Suffixes: The suffix indicates the package type (e.g., N, FN, or GB) and temperature range (e.g., L). Figure B-1 presents a legend for reading the complete device name for any TMS320 family member. Figure B-1. TMS320 Device Nomenclature: TMS 320 30 GB E Prefix Temperature Range
154. ry. The assembler will search through the library and use the members that are called as macros by the source file. Also, it is possible to use the archiver to collect a group of object files into an object library. The linker will include the members in the library that resolve external references during the link. Most EPROM programmers do not accept COFF object files as their input. The ROM30 object format converter must be utilized to convert the COFF object file into Intel, Tektronix, or TI-tagged hex object format. ROM30 is part of the assembler, linker, and archiver package. The converted file can then be downloaded into the EPROM programmer. 5.4 TMS320C3x Software Simulator: A simulator is a software program that simulates the TMS320C3x microprocessor and microcomputer modes for cost-effective software development and program verification in non-realtime. With the inexpensive software simulator, you can debug without target hardware. Files can be associated with I/O ports so that specific I/O values can be used during test and debug. Time-critical code, as well as individual portions of the program, can be tested. The clock's counter allows loop timing during code optimization. Breakpoints can be established according to read/write executions using either program or data memory, or instruction acquisitions. The simulator uses the standard C/assembly source debugger interface described in Section 5.1 al
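To make the Intel hex output format mentioned for the object format converter more concrete, the sketch below formats one Intel hex data record (record type 00) with its checksum. It illustrates the general Intel hex record layout only; it does not reproduce ROM30's behavior or options, and the function name ihex_data_record is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one Intel hex data record (type 00) into out.
   Layout: ':' byte-count(2) address(4) type(2) data(2 per byte) checksum(2).
   Returns the number of characters written, or -1 if out is too small. */
static int ihex_data_record(uint16_t addr, const uint8_t *data, uint8_t len,
                            char *out, size_t out_size)
{
    assert(data != NULL && out != NULL);
    size_t needed = 12u + 2u * len;  /* 11 fixed characters, data, NUL */
    if (out_size < needed)
        return -1;

    /* The checksum covers count, both address bytes, type (0), and data. */
    uint8_t sum = (uint8_t)(len + (addr >> 8) + (addr & 0xFF));
    int pos = sprintf(out, ":%02X%04X00", (unsigned)len, (unsigned)addr);
    for (uint8_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
        pos += sprintf(out + pos, "%02X", (unsigned)data[i]);
    }
    /* Checksum is the two's complement of the low byte of the sum. */
    pos += sprintf(out + pos, "%02X", (unsigned)(uint8_t)(0u - sum));
    return pos;
}
```

For the three data bytes 02 33 7A at address 0030 hex, this produces the record :0300300002337A1E. An EPROM programmer verifies each record by adding every byte, including the checksum, and checking that the low byte of the total is zero.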
155. s data is input or output have just as much effect on overall system performance as does the algorithm itself. Using the SPOX memory management functions, application programs create individual array objects whose respective data buffers can be dynamically allocated and freed during the course of execution. Unlike the standard C functions malloc and free, the SPOX array functions enable the application to supply a parameter specifying the segment of memory in which these buffers will reside. Since production DSP hardware platforms typically contain a hierarchy of memory types (on-chip RAM, external SRAM, bulk DRAM, etc.), retaining explicit control over the location of data becomes essential to meeting realtime constraints in many applications. SPOX OS supports device-independent I/O, meaning that a uniform set of I/O operations is mapped into an otherwise diverse set of devices. The high-level nature of device-independent I/O operations provides a consistent programming interface for a number of off-the-shelf device drivers for accessing and controlling each device within the system, and insulates applications from the low-level details of managing these devices. SPOX OS also provides a mechanism for adding platform-dependent drivers (software modules that encapsulate low-level hardware details) by interpreting device-independent I/O requests in a device d
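The idea of passing a memory segment to the allocator, as described for the SPOX array functions, can be sketched as follows. This is a minimal illustration and not the SPOX API: the segment names, pool sizes, and the functions seg_alloc and seg_reset are hypothetical, and the sketch uses simple bump allocation, so it frees a whole segment at once rather than individual buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory segments; names and sizes are illustrative only. */
enum { SEG_ONCHIP, SEG_SRAM, SEG_DRAM, SEG_COUNT };

static uint32_t onchip_pool[64];
static uint32_t sram_pool[256];
static uint32_t dram_pool[1024];

typedef struct {
    uint32_t *base;   /* start of the segment */
    size_t    words;  /* capacity in 32-bit words */
    size_t    used;   /* words handed out so far */
} Segment;

static Segment segments[SEG_COUNT] = {
    { onchip_pool, 64, 0 },
    { sram_pool, 256, 0 },
    { dram_pool, 1024, 0 },
};

/* Allocate `words` 32-bit words from the chosen segment.
   Returns NULL when that segment cannot satisfy the request. */
static uint32_t *seg_alloc(int seg, size_t words)
{
    assert(seg >= 0 && seg < SEG_COUNT);
    Segment *s = &segments[seg];
    if (words > s->words - s->used)
        return NULL;
    uint32_t *p = s->base + s->used;
    s->used += words;
    return p;
}

/* Release everything in one segment at once. */
static void seg_reset(int seg)
{
    assert(seg >= 0 && seg < SEG_COUNT);
    segments[seg].used = 0;
}
```

The caller decides placement explicitly: a filter's coefficient buffer might come from SEG_ONCHIP while a large capture buffer comes from SEG_DRAM. A request that does not fit in the chosen segment fails rather than silently falling back to slower memory, which keeps timing predictable.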
156. ser's code. Maintains an SPC (section program counter) for each section of object code. Defines and references global symbols. Assembles conditional blocks. Supports macros, allowing the user to define macros either in line with or within a macro library. The linker combines object files into a single executable object module. As it creates the executable module, it performs relocation operations and resolves external references. The linker accepts COFF (common object file format) object files created by the assembler as its input. It can also accept archive library members and modules created by a previous linker run. Linker directives allow you to combine object file sections, bind sections and symbols to specific addresses, and define/redefine global symbols. The linker has these features: Defines a memory model that conforms to the target system's memory. Combines object file sections. Allocates sections into specific areas within the target system's memory. Defines or redefines global symbols to specific values. Relocates sections to final addresses. Resolves undefined external references between the input files. Allows separate load-time and runtime addresses for sections of code. The archiver makes it possible to collect a group of files into a single archive file. For example, several macros can be collected together into a macro libra
157. sion when memory is unavailable Optional timeout for any task suspension System history log Task performance analysis facilities Task oriented debugger MS DOS compatible floppy file system TCP IP network support Q4 92 Technical Support Structured and documented source code Detailed programmer s reference manual Detailed internal design manual Telephone consultation Warranty and maintenance service Extensive counseling and contract services Shipping Media MS DOS 5 1 4 inch diskette TMS320C31 Third Party Support 6 3 Accelerated Technology Inc Figure 6 1 Realtime Application Tasks Hardware Interrupt Handlers Context Save Restore Service Requests Time Slice Timeout Clock Suspension Management of Management Queues Timer Events Schedule Request Resources Management Task Scheduling Requests Fixed Memory Control of Application Task Variable Memory Development Support Task previously suspended by service call Control Relinquished to Application Task Control Relinquised to Service Application Task Requests Realtime Application Task Schedule Management Development Support NU Start NU Reset Performance Timer NU Change Priority NU Retrieve Next History Entry NU Change Time Slice NU Retrieve Performance Info NU Control Interruptor NU Start History Saving NU Enable Preemption NU Start Performance Timer NU Disable Preemption
158. sor mode pin. SHZ (1 pin): Shutdown high Z. An active low shuts down the TMS320C31 and places all pins in a high-impedance state. This signal is used for board-level testing to ensure that no dual-drive conditions occur. CAUTION: An active low on the SHZ pin corrupts TMS320C31 memory and register contents. Reset the device with SHZ = 1 to restore it to a known operating condition. XF1, XF0 (2 pins; I/O/Z): External flag pins. They are used as general-purpose I/O pins or to support interlocked processor instructions. Serial Port 0 Signals (6 Pins). CLKR0 (1 pin; I/O/Z; condition S): Serial port 0 receive clock. This pin serves as the serial shift clock for the serial port 0 receiver. CLKX0 (1 pin; I/O/Z; condition S): Serial port 0 transmit clock. This pin serves as the serial shift clock for the serial port 0 transmitter. DR0 (1 pin; I/O/Z; condition S): Data receive. Serial port 0 receives serial data via the DR0 pin. DX0 (1 pin; I/O/Z; condition S): Data transmit output. Serial port 0 transmits serial data on this pin. FSR0 (1 pin; I/O/Z; condition S): Frame synchronization pulse for receive. The FSR0 pulse initiates the receive data process over DR0. FSX0 (1 pin; I/O/Z; condition S, R): Frame synchronization pulse for transmit. The FSX0 pulse initiates the transmit data process over pin DX0. Key: input (I), output (O), high-impedance state (Z); S = SHZ active, H = HOLD active, R = RESET active. Table 2-9. TMS320C31 Signal Descriptions (Concluded): I/O/Z, Description, Condition When Signal Is
159. such applications is understood to be fully at the risk of the customer using TI devices or systems. TI assumes no liability for applications assistance, customer product design, software performance, or infringement of patents or services described herein. Nor does TI warrant or represent that any license, either express or implied, is granted under any patent right, copyright, mask work right, or other intellectual property right of TI covering or relating to any combination, machine, or process in which such semiconductor products or services might be or are used. Copyright © 1992, Texas Instruments Incorporated. Read This First. How to Use This Manual: This document contains the following chapters. Chapter 1, Introduction: A general description of the TMS320C31, its key features, benefits, embedded controller requirements, compatible devices, and development support. Chapter 2, TMS320C31 Architectural Overview: Functional block diagram, TMS320C31 architecture description, hardware components, and device operation. Instruction set summary. Chapter 3, TMS320C31 Features/Performance Comparison: Comparison of TMS320C31 benchmark performance and feature values versus those of other embedded controllers. Chapter 4, Application Examples: Four application examples showing how the TMS320C30 and TMS320C31 have been used for system control functions in several application areas. Devel
160. supported by the TMS320C31. Figure 2-3 shows these internal buses and their connection to on-chip and off-chip memory blocks. The program counter (PC) is connected to the 24-bit program address bus (PADDR). The instruction register (IR) is connected to the 32-bit program data bus (PDATA). These buses can fetch a single instruction word every machine cycle. The 24-bit data address buses (DADDR1 and DADDR2) and the 32-bit data data bus (DDATA) support two data memory accesses every machine cycle. The DDATA bus carries data to the CPU over the CPU1 and CPU2 buses. The CPU1 and CPU2 buses can carry two data memory operands to the multiplier, ALU, and register file every machine cycle. Also internal to the CPU are register buses REG1 and REG2 that can carry two data values from the register file to the multiplier and ALU every machine cycle. Figure 2-2 shows the buses internal to the CPU section of the processor. The DMA controller is supported with a 24-bit address bus (DMAADDR) and a 32-bit data bus (DMADATA). These buses allow the DMA to perform memory accesses in parallel with the memory accesses occurring from the data and program buses. 2.5 On-Chip Peripherals: All TMS320C31 peripherals are controlled through memory-mapped registers on a dedicated peripheral bus. The peripheral bus is composed of a 32-bit data bus and a 24-bit address bus. The peripheral bus permits straightforward communication to the peripherals
161. t 0 [Figure 2-1, continued: serial port 0 (Port Control Register, R/X Timer Register, Data Transmit Register, Data Receive Register; pins FSX0, DX0, CLKX0, FSR0, DR0, CLKR0) and two timers (Global Control Register, Timer Period Register, Timer Counter Register; pins TCLK0, TCLK1).] 2.2 Central Processing Unit (CPU): The TMS320C31 has a register-based, pipelined CPU architecture. The C31 CPU is similar to a RISC microprocessor CPU in that most instructions execute in a single cycle. However, the C31 instruction set is more powerful: multiple operations can be performed in a single instruction cycle, and the operands of logical and arithmetic instructions can be read from memory and operated on in a single cycle. Because its separate multiplier and ALU are incorporated into the CPU, the C31 supports single-cycle logical and arithmetic operations. These units do not require pipelined staged execution to achieve maximum performance, allowing the C31 to achieve low-latency execution of numeric operations. In addition, the same multiplier and ALU are used for both integer and floating-point math, providing you flexibility and equal performance for either data format. The TMS320C31 can perform a multiply and ALU operation in a single cycle, allowing realtime DSP or other math and logical
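The pairing of one multiply with one ALU operation is the shape of a multiply-accumulate loop, which a short C sketch makes visible. The function below is a generic FIR-style dot product, offered as an illustration only: the name fir_output is hypothetical, and whether a given compiler maps each iteration onto a single parallel multiply/add instruction is an assumption that depends on the toolchain and addressing used.

```c
#include <assert.h>
#include <stddef.h>

/* Compute one filter output as a sum of products.
   Each loop iteration needs exactly one multiply and one add. */
static float fir_output(const float *coeff, const float *samples, size_t taps)
{
    assert(coeff != NULL && samples != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < taps; i++)
        acc += coeff[i] * samples[i];  /* multiply, then accumulate */
    return acc;
}
```

With coefficients 0.5, 0.25, 0.25 and samples 4, 8, 4 the result is 5.0. On a processor that issues the multiply and the add together, the cost of such a loop approaches one cycle per tap.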
162. technical staff can offer applications assistance with customer designs through local Regional Technology Centers. Services include: design assistance; simulation/emulation. Each Regional Technology Center uses up-to-date development systems, including workstations and personal computers, plus demonstration, test, and evaluation equipment. TI staff designers use fully equipped laboratories to provide efficient design assistance. The first step to a successful design is an explanation of the project's parameters, production requirements, design function(s), and price. The results of these discussions will allow TI and a customer to explore: design cost trade-offs; product implementation options. Once the various trade-offs/options are selected and approved, Texas Instruments can provide further assistance in the design of a customer's product, sharing a mutual goal of bringing a successful product to market as quickly as possible. 5.9.8 RTC Locations: The following list gives the worldwide locations of the TI Regional Technology Centers. RTC Worldwide Locations. North American Locations: ATLANTA: Texas Instruments, 5515 Spalding Drive, Norcross, GA 30092, (404) 662-7950. BOSTON: Texas Instruments, 950 Winter Street, Suite 2800, Waltham, MA 02154-1263, (617) 895-9196. CHICAGO: Texas Instruments, 515 W. Algonquin Road, Ar
163. tems for the TMS320C3x and TMS320C4x DSPs. The compiler targeted to the C30 has been validated by the U.S. Government's Ada Compiler Validation Capability under test suite version 1.11. Standard components of the compilation systems are: highly optimizing compiler; Ada Librarian; small modular runtimes; standard predefined Ada packages; ARTclient package permitting access to tasking data structures and operations; Intrinsics package permitting access to hardware capabilities; Math package of elementary functions; cross-reference facility; AdaScope debugger; linker, object librarian, and utilities; help facility and documentation. The Ada compiler produces fast, compact code through Ada-specific optimizations, optimizations that take advantage of the processor's architecture features, and a full range of classical optimizations. Five optimization levels permit proper optimization strategy at each point in the development cycle. Support for Ada language features includes: representation specifications for type sizes, record layout, enumeration values, object addresses, and interrupt entries; unchecked deallocation and conversion; insertion of routines written in machine code; all Ada predefined pragmas and the implementation-defined pragmas Foreign_body and Linkage_name. C3x and C40 specific features include: access to many processor-specific native instructions; circular and bit-reversed addressing; Delay
164. tions. TMS320C3x Emulator: The TMS320C3x XDS emulator (see Figure 5-11) is a user-friendly, PC-based development system that supports hardware development on the TMS320C30 and TMS320C31. This emulator provides a means for developing the software and hardware within a target system. Access is provided to every memory location and register of the TMS320C3x through the use of a revolutionary scan path interface. The TMS320C3x XDS emulator board interprets commands and converts these commands into the appropriate signal sequences necessary to control the TMS320C3x in your target system. Key features of the TMS320C3x XDS emulator include: full-speed execution and monitoring of the TMS320C3x in your target system via a 12-pin target connector; TMS320 C/assembly source debugging (PC/MS-DOS) via TI's standard windowed Programmer's Interface (see Section 5.2); 200 software breakpoints; software trace/timing; single-step execution; loading/inspecting/modification of all registers; uploading/downloading of program memory and data memory; benchmarking of execution time of clock cycles. [Figure 5-11. TMS320C3x XDS Emulator] Software breakpoints allow program execution to be halted at a specified instruction address. When a given brea
165. titasking SPOX OS applications. It allows developers to perform debug and profile functions from within the C debugger. SPOX MP is a set of software functions that provide a foundation for multi-DSP applications. These include interprocessor communication primitives, management of shared memory, and the ability to reassign tasks across processor boundaries. Realtime Multitasking: The SPOX OS offers all of the features typically found in other realtime multitasking kernels: preemptive, event-driven scheduling; dynamically prioritized tasks; synchronization and communication facilities; timer services; handling of device interrupts. By offering these features, SPOX OS enables realtime multitasking applications typically relegated to general-purpose microprocessors to execute on the DSP. Older configurations with 16-bit DSPs used as slave processors controlled by a more intelligent general-purpose master can now be replaced by single-chip 32-bit DSP solutions. Thus, SPOX manages multiple tasks executing numerically intensive algorithms in parallel with other system control and communication functions. Memory Management, Device-Independent I/O, and Host Communication: While numerical processing may dominate DSP applications, memory allocation, I/O, and communication are equally vital when turning a theoretical algorithm into a practical application. Where the data is located in memory and how thi
166. ts; 2.9 TMS320C31 Signal Descriptions. 2.1 TMS320C31 Block Diagram: Figure 2-1 is a block diagram of the TMS320C31 architecture. Throughout this chapter, refer to this block diagram to better understand the interface of the components of the C31 embedded controller. [Figure 2-1. TMS320C31 Block Diagram: RAM Block 0 (1K x 32) and RAM Block 1 (1K x 32); PDATA, PADDR, DDATA, DADDR1, DADDR2, DMADATA, and DMAADDR buses; 32-bit barrel shifter; ALU; extended-precision registers R0 to R7; auxiliary registers AR0 to AR7; 12 other registers; DMA controller (Global Control Register, Source Address Register, Destination Address Register, Transfer Counter Register).] Serial Por
167. ture/Performance Comparison of Embedded Controllers. Columns: Device; MHz; RAM (Bytes); On-Chip Peripherals (Serial Ports, Timer, DMA Chan); Multiply Time, ns (Integ/Float). x 1 1 16 1 16 2048 2 180 NA MC68331 16 2 4 1 16 M 1 180 NA aoas 1218 1 82 sme o v o roo Key: NA = The device does not support this feature in hardware. 3.2 TMS320C31 Benchmark Performance Versus Other Embedded Controllers: The best method to evaluate a processor's performance in a given application is to benchmark the execution time of the application's software under target system constraints. The next best evaluation method is to benchmark the performance of similar code or code that is representative of the target application. However, due to short product development cycles, the processor evaluation period is rarely long enough to do the code development and system emulation necessary to perform such a rigorous performance analysis for each candidate device. Consequently, many system designers use published device benchmarks to obtain rough performance estimates for different classes of algorithms. Table 3-3 shows the published manufacturer benchmarks for several C-language programs. These benchmarks have been used by processor manufacturers to highlight the general performance of their devices and are a subset of a
168. ture is available to assist you through the design cycle. These documents include product and preview bulletins, data sheets, user's and reference guides, over 2000 pages of application notes, and textbooks offered by Prentice-Hall, John Wiley and Sons, and Computer Science Press. To inquire about available TMS320 literature, call the Customer Response Center (CRC) at (214) 995-6611. The following list describes the general contents of each major category of technical documentation available through the Customer Response Center. Product and preview bulletins and product briefs give an overview of the devices and development support within the TMS320 family, presenting capabilities, diagrams, and hardware/software applications. User's guides for TMS320 processors provide detailed information regarding the architecture of the device, its operation, assembly language instructions, and hardware and software applications. Data sheets include electrical specifications, timing characteristics, and mechanical data for a device. Application books/reports describe theory and implementation of selected TMS320 applications, including algorithms, code, and block/schematic logic diagrams. Currently, there are over 2000 pages of application reports to support the TMS320 family. Technology brochures provide an overview of various implementations of DSP technology. 5.9.2 Deta
169. tween the two is needed. The PC/C31 is fitted with two of LSI's daughter module sites, giving it access to the high-quality interfaces that make up the daughter module range. This presently comprises both delta-sigma and successive-approximation devices and is continually expanding. Using the currently available successive-approximation modules, it is possible to construct a 4-input, 4-output analog system with a maximum sampling frequency of 200 kHz on the inputs and 500 kHz on the outputs. The modules are designed for quality of conversion. Signal-to-noise and distortion figures of 90 dB for the delta-sigma part have been measured with modules mounted on DSP boards and placed within a PC. Parallel expansion is provided by an updated version of LSI's DSPLINK interface standard. The bus provides a standardized interface to all of LSI's DSP boards and allows the use of a range of readily available peripheral boards, including multichannel analog I/O and AES/EBU pro-audio digital interfaces. The DSPLINK specification is published, allowing users to easily interface a custom design to the bus. Improvements to the original DSPLINK include a 32-bit data bus and additional address lines. Code development support will be provided by the Texas Instruments floating-point DSP tools that include an optimizing ANSI C compiler, assembler, and linker. These tools cover the whole TI floating-point DSP range, making upgrades or changes to/from other
170. unication and all scheduling required for a concurrent application. Interrupt Management: Precise MPX supports dynamic, direct connection to interrupts. Interrupts can be either exceptions generated by the DSP or external device interrupts. The software designer is responsible for writing the interrupt service routine, called the notifier. Notifiers can be implemented in C or in assembly language. Interrupts and notifiers can be defined during executive initialization, or they can be installed by any task during execution. Notifiers are equivalent to tasks, except that they do not require the overhead of tasks and are not scheduled by the executive. A task that is ready to receive an interrupt uses the Await interrupt primitive. A notifier needs to perform only two actions to reply to a waiting task: first, it calls Task awaiting interrupt to determine which task is waiting; then it calls Add ready, which readies the waiting task. Memory Management: Precise MPX includes a dynamic memory manager that tasks use to allocate extra temporary or private memory areas, exclusive of the task's stack. The memory management algorithm is a first on request. On release, it groups together the nearest neighbors to minimize memory fragmentation. Server Management: Precise MPX includes primitives that support client/server design paradigms. The client/server model is a powerful design method for developing robust, reusable applications for commu
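The notifier handshake described under Interrupt Management in excerpt 170 can be illustrated with a small, self-contained toy model in C. This is only a sketch under stated assumptions: it is not the Precise MPX API, and every identifier below (`executive_init`, `await_interrupt`, `task_awaiting_interrupt`, `add_ready`, `notifier`, the table sizes) is hypothetical, chosen to mirror the primitive names mentioned in the excerpt. A real executive would also block the calling task and reschedule, which this model does not attempt.

```c
#include <assert.h>

/* Toy model only: all names and sizes are hypothetical, not Precise MPX. */
#define MAX_TASKS 4
#define MAX_IRQS  4
#define NO_TASK   (-1)

typedef enum { TASK_READY, TASK_WAITING_IRQ } task_state;

static task_state states[MAX_TASKS]; /* per-task scheduling state        */
static int waiters[MAX_IRQS];        /* task waiting on each interrupt   */

/* Reset the model: every task ready, no interrupt has a waiter. */
void executive_init(void)
{
    for (int t = 0; t < MAX_TASKS; t++) states[t] = TASK_READY;
    for (int i = 0; i < MAX_IRQS; i++)  waiters[i] = NO_TASK;
}

/* Models "Await interrupt": the task registers as the waiter for irq. */
void await_interrupt(int task, int irq)
{
    assert(task >= 0 && task < MAX_TASKS);
    assert(irq >= 0 && irq < MAX_IRQS);
    states[task] = TASK_WAITING_IRQ;
    waiters[irq] = task;
}

/* Models "Task awaiting interrupt": which task, if any, waits on irq. */
int task_awaiting_interrupt(int irq)
{
    assert(irq >= 0 && irq < MAX_IRQS);
    return waiters[irq];
}

/* Models "Add ready": put the task back on the ready list. */
void add_ready(int task)
{
    assert(task >= 0 && task < MAX_TASKS);
    states[task] = TASK_READY;
}

/* Read-only helper for inspecting the model. */
task_state task_state_of(int task)
{
    assert(task >= 0 && task < MAX_TASKS);
    return states[task];
}

/* The notifier performs the two actions from the text:
   look up the waiting task, then ready it. */
void notifier(int irq)
{
    int task = task_awaiting_interrupt(irq);
    if (task != NO_TASK) {
        waiters[irq] = NO_TASK; /* the wait is consumed */
        add_ready(task);
    }
}
```

In this model, calling `executive_init()`, then `await_interrupt(2, 1)`, then `notifier(1)` leaves task 2 ready again with no waiter recorded for interrupt 1. A notifier that fires with no waiting task simply does nothing.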
171. ve environment that supports larger applications executing a variety of numerically intensive algorithms and performing system control and communication functions. TMS320C31 Third-Party Support. Spectron Microsystems, Inc. Figure 6-3. SPOX Architecture [block diagram not reproduced]. Figure 6-3 depicts the overall architecture of SPOX, illustrating its major functional capabilities along with their organization into the following distinct software components. SPOX OS is the foundation of SPOX. It provides a set of system capabilities that include memory management (supplying dynamic allocation of arrays from multiple memory segments), hardware interrupt handling, control of multiple realtime tasks executing within a single program, and a uniform, device-independent stream I/O interface to platform-specific drivers that manage peripherals used for system I/O and communications. It serves as the foundation for the remaining application libraries and system components. SPOXLIBC is a standard C runtime library that provides rudimentary file I/O capabilities on the DSP or seamless integration with the file system of an adjoining host computer. SPOX MATH is a comprehensive library of optimized DSP math functions that operate on vectors, matrices, and filters. SPOX DBUG extends the capabilities of DSP C source debuggers, such as the Texas Instruments db30, to simplify the development of realtime mul
172. velopment Support. Throughout the design of the TMS320C3x DSPs, hardware and software engineers worked with device architects to create a processor ideally suited to today's development tool technologies. The result is a full set of hardware and software tools. From the friendly Programmer's Interface to TI's unique scan-based emulator, the development environment makes the design of embedded systems fast and easy. This chapter provides an overview of the development support products supporting TMS320C3x design. Note: A floating-point compiler, assembler, and linker support the TMS320C31, TMS320C30, TMS320C40, and all future spin-offs of the C3x and C4x generations. Complete support for all 32-bit TMS320 processors provides an efficient upgrade path without requiring the purchase of additional compilers, assemblers, or linkers. Throughout this chapter, this compiler will be referred to as the TMS320C31 compiler; the assembler/linker will be referred to as the TMS320C31 assembler/linker. Topics: 5.1 TMS320 Optimizing ANSI C Compilers; 5.2 TMS320 Programmer's Interface (C/Assembly Source Debugger); 5.3 TMS320C31 Assembly Language Tools; 5.4 TMS320 Software Simulators; 5.5 TMS320C3x Evaluation Module; 5.6 TMS320C3x Emulator
173. wn in Figure 6-3, which can be configured and customized for the customer's hardware. Application Library Packages: All major suppliers of plug-in DSP boards offer the complete library of SPOX application functions for C runtime environment, realtime stream I/O, DSP math, and host/DSP communication. These SPOX application library packages are transforming PCs and workstations into signal processing systems that integrate the flexibility of a host computer with the power of attached DSP hardware. SPOX Evaluation Kit: The SPOX EVM evaluation system provides DSP system developers with a low-cost, easy-to-use solution for evaluating the SPOX system kernel on a TMS320C3x hardware platform. The SPOX EVM product streamlines the evaluation process by integrating all of the necessary hardware and software components into a single turnkey package: the TMS320C3x EVM hardware platform, the TMS320C3x C compiler and assembly language tools, and the SPOX OS software. SPOX EVM can also serve as a development platform for building rapid prototypes of new DSP systems. All application software developed initially under SPOX EVM can later be reused on any production hardware platform using SPOX. Open Signal Processing Architecture (OSPA): While improvements in application productivity and portability are proven benefits of SPOX, the true power of a standard software interface to underlying DSP hardware comes with bringing togethe
174. [Device comparison table not reproduced.] Table footnotes: Military version available or planned; contact the nearest TI Field Sales Office for availability. Ser = serial; Par = parallel; DMA = direct memory access (concurrent with CPU operation); Int = internal; Ext = external; Com = parallel communication ports. Sixteen of these parallel I/O ports are memory mapped. Single logical memory space for program, data, and I/O, minus on-chip RAM, peripherals, and reserved spaces. Includes the use of serial port timers. Dual buses. Contains an on-chip bootloader ROM. Note: Programmed transcoders (TMS320SS16 and TMS320SA32) are also available. The TMS320 Product Roadmap. Table A-2. TMS320 Family Features and Benefits. Five generations of more than 25 compatible devices: a DSP to meet any application need. Cycle times as fast as 35 ns; choice of fixed-point or floating-point devices; hardware multiplier and barrel shifters; modified Harvard architecture; concurrent DMA; program cache: realtime DSP performance. On-chip data RAM (up to 8.5K words) and program ROM/EPROM (up to 4K words); serial port, timer, multiprocessor interface, instruction cache, DMA controller; CMOS processing: reduced system cost, space, and power consumption. Large memory space up to 4 gigaw
175. you to debug a program in the language in which it is written. Programs can be debugged in C, assembly language, or both. The debugger also has profiling capabilities that show where to focus development time by quickly identifying the hot, or time-consuming, sections of a program. Figure 5-8. The Basic Debugger Display [screen capture not reproduced; it shows the pulldown menus (Load, Break, Watch, Memory, Color, MoDe, Run=F5, Step=F8, Next=F10) and the DISASSEMBLY, CALLS (function call traceback), CPU, WATCH (natural-format data displays), FILE (C source display), and MEMORY windows]
