        Embedded Edge
         Contents
1. Speeding the Development of Multi-DSP Applications

By Fiona Culloch

Systems; Pegasus from Jovian Systems; Accelera from Spectrum Signal Processing, which is specific to the company's boards.

Diamond and Virtuoso ease the software burden of coordinating many processors with features like Eonic's Virtual Single Processor model and 3L's "virtual channels." These features free you from the protocol complexities of forwarding messages among a network of processors and from the interface details of your particular communications hardware. This class of products also provides ready-made software that lets a range of off-the-shelf DSP development boards communicate with user-written GUI code on a host PC.

Pegasus and Accelera are higher-level tools: graphical development environments for multi-DSP applications. Instead of writing C code to implement DSP algorithms and interprocessor communications, you can simply choose from a palette of predefined functional blocks (or add your own) and visually place the blocks onto processors.

Both kinds of tool are useful, since you often want to work at both levels. In fact, Pegasus can use either Diamond or Virtuoso as its underlying communications framework. Accelera uses Spectrum's own quicComm software as its communications layer, as it targets proprietary communications hardware.

COMMON PROBLEMS WITH MULTIPROCESSORS

Ev
2. Phase 1 involves compiling and profiling your baseline C code. Before you begin any optimization effort, use the profiling tools to identify the performance-critical areas in your code.

Phase 2 involves compiling with the appropriate optimization options and analyzing the feedback provided by the compiler to improve the performance of your code.

Phase 3 is a critical phase during which you use a number of techniques to tune your C code for better performance.

Phase 4 is needed if the performance of certain areas of your code must be improved beyond the tuning phase. After yet another profile of the code, you can extract the performance-critical areas and rewrite them in linear assembly language.

THE FIRST THREE PHASES

Phase 1 establishes your baseline. You have a goal; for example, your system requirement might be a statistical mix of 48 modems on one

[Figure 1 flowchart. Phase 1: develop C code, compile, profile; done if density requirements are met. Phase 2: compile with optimization options, profile; done if density requirements are met. Phase 3: tune C code, compile, profile; done if density requirements are met. Phase 4: write linear assembly code, optimize assembly code, profile; done if density requirements are met.]

Figure 1. The Code Composer Studio optimization tutorial recommends a code development flow consisting of four phases. The first
3. Practical solutions for DSP system developers

SPRN154

Optimizing Soft Modems

Speed the Development of Multi-DSP Applications

The right emulator won't leave you stranded when new DSPs come along.

With new TMS320 DSPs coming out with dizzying regularity, you need a development platform that stays up to date. Only the ultra-flexible FleXDS from DSP Research lets you keep up with Texas Instruments. FleXDS is a new breed of emulator: a reconfigurable platform that expands to become a complete hardware and software development system and a powerful test bed for your TMS320 applications. No one has more experience with TI DSP development than DSP Research. So when TI puts you to the test with a new DSP, look to FleXDS for answers.

DSP Research, 1-408-773-1042

Key FleXDS Features
PCI-based in-circuit emulator
JTAG POD with detachable cable and diagnostic LEDs
DSP and expansion modules for TI's C6000, C5000, and more
More than 15 modules available
Debug software before your target system is ready

A Texas Instruments Publication

Volume 2, June 2001, Number 2

Stan Runyon, Editor in Chief, testman2@earthlink.net
Mike Robinson, Managing Editor, mrobinso@cmp.com
Tim Moran, Creative Director
Donna Moran, Art Director
Genevieve Joerger, Director, Custo
4. REFERENCES

Bateman, Andrew. Digital Communications: Design for the Real World. Addison-Wesley, London, 1998.

Bateman, Andrew, and Iain Paterson-Stephens. The DSP Handbook. Prentice Hall, Upper Saddle River, N.J., forthcoming.

Kenington, Peter B. High-Linearity RF Amplifier Design. Artech House, Boston and London, 2000.

Andrew Bateman is the CEO of Avren Ltd., a design and consultancy company that advises communications and IC companies on technologies, standards, and partnerships for next-generation wireless systems, and of DSPStore.com. Previously, he was a professor of communications and signal processing at Bristol (U.K.) University and cofounded Wireless Systems International Ltd., where he was the business development director. He is also the author of three books on digital communications and digital signal processing.

MESi is unlocking the barriers to widespread DSP software modem use.

Soft Modems

In just a few short phases, you can optimize C-coded reference modems to meet higher density targets.

Optimizing Modems Using Code Composer Studio and TI Resources

By Ghassan Farah

For many of the general telephony stream-processing tasks, the C optimizer for the Texas Instruments
5. NFO Evaluation Board

An evaluation board that operates with a three-phase induction motor takes advantage of the natural field-oriented (NFO) control algorithm stored in the flash memory of TMS320LF2406A 40-MIPS DSPs. The algorithm gives accurate, sensorless torque control over a wide speed range. The board can work in a stand-alone mode, with speed or position sensors feeding on-board control loops, or connected to a PC through an optically coupled serial link. Available in July and bundled with a LabVIEW user interface that controls the motor and modifies motor and control parameters, it sells for $500. NFO Control AB, Lund, Sweden; +46 46 286 29 26; www.nfo.se

Prototyping Daughterboards

Two prototyping daughterboards, ProtoPlus and ProtoPlus Lite, let developers construct a prototype circuit that plugs into the TMS320C2000 and C6000 DSP platforms, the C54x, and the DSK and EVM development systems. The boards give access to all signals and power rails. They accept external 12-V power. The ProtoPlus Lite, a two-layer board, sells for $125. The ProtoPlus, a six-layer board, has separate ground and 3.3- and 5-V planes; it sells for $225. DSP Global, Inc., Warwick, R.I.; (401) 737-9900; www.dspglobal.com

Single-Board Media and Signaling Gateway

SuperSpan II is a completely integrated single-board embedded hardware and software platform for implementing c
6. WIRELESS INTERNET DEVICES USING DSP SOFTWARE

With the latest DSP devices, you can design a software-configurable, low-cost modem that's very efficient and small.

By Andrew Bateman

It's a very exciting time if you're designing a wireless modem. The Internet is the fastest-growing sector of the market, and high-speed wireless connection for business and residential users is a huge business opportunity. You also have unprecedented flexibility in the range of wireless functions that can be implemented in the digital domain, and new algorithms that fully exploit this advantage are appearing.

[Figure 1 block diagram: antenna; receive path of low-noise amplifier, filter, mixer, filter; transmit path of power amplifier, filter, mixer; DSP; microprocessor.]

Figure 1. The architecture of a modern software radio is surprisingly simple, comprising only a few core building blocks. Linearity in the power amplifier and a good low-noise front end and synthesizer are essential, as of course are a high-speed, low-cost, low-power DSP engine and data converters (highlighted).

Still, wireless modems present a serious technical challenge, and that's even truer for wireless LAN (WLAN) devices. Cost-effectively distributing the ultrahigh capacity of a fiber node (multiple gigabits per second) to disp
7. existing implementations from the framework vendor and use that code as a base for development.

In all cases, as soon as a working driver exists for the custom hardware, multiprocessor code written with the framework (either on a supported COTS board, on a single-processor EVM, or on Windows) should run unchanged on the new hardware.

Fiona Culloch (fc@3L.com) is technical support director at 3L Limited in Edinburgh, Scotland, where she has been involved with real-time kernel development and software to simplify integration of DSPs with host GUI applications. From 1985 to 1987, she was software development manager for the compilers group at Lattice Logic Limited.

Real-Time Profiler Aids Code Optimization

A software-based statistical real-time profiler can help you vastly improve the performance of TMS320C62x DSP code.

Optimizing software code to boost the performance of an application is one of the greatest challenges in writing real-time DSP software. At Surf Communication Solutions, we've found that the impact of effective code optimization can be dramatic. We've achieved remarkable performance gains of several orders of magnitude during a project's development cycle.

DSP software development follows Pareto's rule, also known as
8. Tools

C code from a visual block diagram, but it still allows you to work with the generated code at that level.

The key to most of the framework benefits is the hardware-independent API for interprocessor communications. Some multiprocessing aspects of Diamond and Virtuoso were influenced by Occam, the native language of the Inmos transputer, which was itself an implementation of Hoare's CSP concept. The CSP notation expresses concurrency by mathematical operators denoting synchronized message passing. Like other languages based on the CSP model, Occam has a special syntax for transmitting messages (! to send, ? to receive). ... uses channels directly only for system processes; its application-level tasks use mailbox and FIFO objects that the system co
9. developing and testing software on a host platform, then porting it to the target system, isn't feasible if the software must assume the presence of the target hardware's interprocessor communications features.

Debugging. The only practical way to debug some common failure modes of multiprocessor systems, especially deadlocks caused by a particular pattern or ordering of communications, is to log the interprocessor communications rather than single-stepping code, which is next to useless for some problems. As mentioned, that can require sophisticated software support in a system built from point-to-point links.

The key element for solving communications, simulation, and debugging problems is abstraction of interprocessor communications and synchronization. Unfortunately, the opposite happens in most DSP projects. Instead of keeping the details of the communications hardware out of the application code, developers tend to tie them as closely together as possible, in a misguided quest for "efficiency." Consider some concrete examples.

The SMT3xx range of modular C6000 DSP hardware from Sundance Multiprocessor Technology is

different for different modules in the range), and field the interrupt generated by the DMA channel on completion of the transfer.

Although the code isn't particularly complex, it's usually slow to write. Not only do you have to completely absorb all the low-level details
10. TMS320C6000 DSP platform can yield higher densities with no assembly coding. Other technologies require a healthy dose of optimization to reach target densities.

You can take steps to optimize C-coded reference modems to meet higher density targets. How high? C baseline modems, for example, can soar from 6 per 200-MHz C6201 to 28 in four short project phases. A fifth project phase can take the number of channels to 48 per DSP.

In fact, for the MSP MEDIA Gateway line of DSP resource boards based on C6000 DSPs, Commetrex undertook the four phases, and the process worked. Our MSP-320 PCI board, with two C6201 DSPs and a quad E1/T1 network interface, needed 48 to 60 channels of processing from each DSP. For many of the general telephony stream-processing tasks, the C6000 C optimizer gave us the densities we needed with no assembly coding.

"Out of the box," C-coded modems, which are a reference design and written for understandability rather than efficiency, might compile to, say, six simultaneous modems. You should be able to double that by guiding the modems through the Code Composer Studio (CCS) optimizer and by ensuring that your memory layout takes advantage of the C6000's on-chip RAM.

CCS includes an optimization tutorial that provides a recommended code development flow consisting of four phases (Figure 1). (A similar tutorial is in the TMS320C6000 Programmer's Guide.)
11. help is available on the TI Web site (www.ti.com). When you download Implementing V.32bis Viterbi Decoding on the TMS320C6200 DSP (SPRA444.PDF), you'll find the decoder in very tight assembly code. You can't just drop it in, though. You'll have to adapt it to your environment.

To make the decoder reentrant, change global variables to per-channel contexts and watch for bugs. You should achieve spectacular results. A straight C-coded Viterbi consumes approximately 150,000 cycles for 80 samples. Substituting TI's assembly code takes that down to an incredible 8,000 cycles. Our V.29 receiver is now 101,429 cycles, and the V.17 receiver only 108,840, and we haven't begun to "vectorize."

Using a statistical mix of modems yields 28 simultaneous channels. In worst-case nonblocking terms, that's 18 simultaneous V.17 receivers. You should see similar results for similar algorithms by using the optimizer and circular addressing.

BEYOND PHASE 4: "VECTORIZE"

If you still haven't reached your performance requirements, you

Listing 2. FIR Filter Using Circular Addressing with Hardware Support

* Replacement for FIR_Filter_Shift
* PARAMETERS:
*   A4 (in) coefs
*   B4 (in) taps, base address of circular buffer
*   A6 (in) length, length of delay line
*   B6 (in) block size
*   A8 (in) write offset, where the next value will be written
*   B8 (in
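The worst-case channel count quoted in the Viterbi discussion can be checked with simple arithmetic. The sketch below is only an illustration under stated assumptions: it assumes an 8-kHz sample rate (so an 80-sample block lasts 10 ms), assumes the receiver cycle counts are per 80-sample block, and uses the 200-MHz C6201 clock mentioned in the article. The function names are hypothetical and are not part of any TI or Commetrex API.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles one DSP can spend while one block of samples arrives.
 * Assumption: block_samples samples arrive at sample_rate_hz. */
static uint32_t cycles_per_block(uint32_t cpu_hz, uint32_t sample_rate_hz,
                                 uint32_t block_samples)
{
    assert(sample_rate_hz > 0);
    /* 64-bit intermediate avoids overflow of cpu_hz * block_samples */
    return (uint32_t)(((uint64_t)cpu_hz * block_samples) / sample_rate_hz);
}

/* Worst-case (nonblocking) channel count: every channel is assumed to
 * need its full cycle count in every block, so the result rounds down. */
static uint32_t worst_case_channels(uint32_t budget_cycles,
                                    uint32_t cycles_per_channel)
{
    assert(cycles_per_channel > 0);
    return budget_cycles / cycles_per_channel;
}
```

Under those assumptions, cycles_per_block(200000000, 8000, 80) gives a budget of 2,000,000 cycles per block, and worst_case_channels(2000000, 108840) gives 18, which agrees with the 18 simultaneous V.17 receivers quoted in the text.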
12. eXpressDSP Compliance for Sound, Speech Techs

WOW audio enhancement technology from SRS Labs, Inc. (Santa Ana, Calif., www.srslabs.com) has passed eXpressDSP compliance testing on the TMS320C54x and TMS320C55x DSPs. The announcement follows a similar one regarding SRS Labs' Voice Intelligibility Processor (VIP) technology, which raises the quality and intelligibility of speech in voice communications equipment and speech technology
13. Mass.; (781) 272-5606; www.dvsinc.com

Ada Suite Teams Up with TI DSPs for Military Service

The 83/95 Ada Compiler, an Ada software suite that targets Texas Instruments' SMJ320C6000 DSP platform for military applications, includes a full Ada symbolic debugger and Ada run-time options that include tight integration with the DSP/BIOS kernel. The suite fits tightly into Code Composer Studio. It works with SMJ320C6000, SMJ320C6201, and SMJ320C6701 DSPs and runs on Windows 95, 98, 2000, and NT machines. The 83/95 Ada Compiler costs $25,000 for the first seat and $20,000 per seat thereafter. A maintenance program costs $5,000 annually. Irvine Compiler Corporation, Irvine, Calif.; (949) 250-1366; www.irvine.com

Audio4 5410 DSP Data Acquisition: 4-input, 4-output audio channels; 16-bit AD/DA, 8 kHz to 48 kHz; high-speed USB interface; real-time Code Composer Studio support; 100-MHz TMS320VC5410 with 64 Kwords internal RAM. 1-888-828-3543; www.domaintec.com

Libraries Tuned to C6000 DSP Platform

The GD 100 DSP vector library for the TMS320C6000 DSP platform comprises over 100 functions and macros, including transforms, filters, and vector operations. The GD 200 math library for the C67x consists of algebraic and trigonometric functions and utilities. They sell for $3,5
14. base

            .global _FIR_Filter
_FIR_Filter: .cproc coef_block, taps, A6, B6, A8, base
            .reg    old_amr, ar, coef, sum1, dl
            .reg    round, one, offs
* coef_block = coefs + (length-1) * sizeof(short)
            SUB     A6, 1, dl                   ; dl = length - 1
            SHL     dl, 1, dl                   ; (length-1) * sizeof(short)
            ADD     coef_block, dl, coef_block  ; coef_block = coefs + dl
            SHL     B6, 16, B6                  ; set block size of circular buffer
            SET     B6, 8, 8, B6                ; set B4 to circular mode
            MVC     AMR, old_amr                ; old_amr = AMR
            MVC     B6, AMR                     ; AMR = addressing mode
            ZERO    A1                          ; sum = 0
* ar = write offset is where we will write the next sample
* advance to the most recent word (write offset - 2)
* ar = taps + (write offset - 2)
            SUB     A8, 2, offs
            ADDAB   taps, offs, taps
            MV      A6, dl
startloop:  .trip   1
            LDH     *taps--, ar
            LDH     *coef_block--, coef
            MPY     ar, coef, sum1
            ADD     sum1, A1, A1
            SUB     dl, 1, dl
  [dl]      B       startloop
* round and remove base
* sum = (sum + (1 << (base-1))) >> base
            SUB     base, 1, round
            MVK     1, one
            SHL     one, round, round
            ADD     round, A1, A1
            SHR     A1, base, A1
            MVC     old_amr, AMR                ; restore original addressing mode
            .return A1
            .endproc

might consider going on to phase 4, changing the flow of data through your code to reduce function calls and utilize more loops that can be optimized easily. For modems, one approach to accomplish that is to "vectorize" the algorithm's implementation.

The sample-rate section of the receiver consists of the following components in series: the pulse-shaping filt
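Returning to Listing 2: the circular addressing it relies on can be modeled in portable C, which may help when checking results on a host before moving to linear assembly. This is a minimal sketch, not the article's code. It assumes a delay line whose size is a power of two (so wraparound is a bit mask), coefficients stored in reverse order as the listing header describes, and a result that fits in 16 bits. The names fir_state and fir_circular are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int16_t *taps;      /* delay line storage */
    size_t   size;      /* entries in taps; assumed to be a power of two */
    size_t   write_idx; /* where the next sample will be written */
} fir_state;

/* Store one sample and return one filtered output. No data is moved:
 * only the index advances, wrapping with a mask the way the hardware
 * wraps a pointer in circular addressing mode. */
static int16_t fir_circular(fir_state *s, const int16_t *coefs,
                            size_t length, unsigned base, int16_t sample)
{
    assert(s->size != 0 && (s->size & (s->size - 1)) == 0);
    assert(length > 0 && length <= s->size && base > 0 && base < 31);

    size_t mask = s->size - 1;
    size_t idx = s->write_idx;                 /* most recent sample */
    s->taps[idx] = sample;
    s->write_idx = (s->write_idx + 1) & mask;

    int32_t sum = 0;                           /* assumes no 32-bit overflow */
    for (size_t i = 0; i < length; i++) {
        /* reverse-ordered coefs: the last entry pairs with the newest tap */
        sum += (int32_t)s->taps[idx] * coefs[length - 1 - i];
        idx = (idx - 1) & mask;                /* step back in time, wrapping */
    }
    sum += (int32_t)1 << (base - 1);           /* round */
    /* right shift of a negative value is implementation-defined in C;
     * most compilers shift arithmetically, which is what is assumed here */
    return (int16_t)(sum >> base);
}
```

With a four-entry delay line, coefficients {8192, 16384}, and base 15 (0.25 and 0.5 in Q15), feeding the samples 1000 and then 2000 returns 500 and then 1250. The mask plays the role that the block-size field of the AMR register plays in the assembly version.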
15. boards. Good multiprocessor development tools let you port the working code to the target system without application source changes, thus addressing the simulation problem, provided that multiprocessor COTS hardware is available.

Multiprocessor software development on a single-processor platform. Being able to develop multiprocessor software on a single-processor platform means that not only is the application code independent of the communications hardware that connects the processors, it also can be independent of the number of processors and their connectivity. In particular, you can take a multiprocessor application consisting of separate programs for several processors and run it unchanged on a single processor, as long as there is adequate memory. The software automatically relocates each program to a separate position in the single processor's memory. An RTOS kernel time-shares the processor between the programs and transforms what were previously interprocessor communications into intertask communications calls to the kernel.

More interestingly, it's equally possible to go the other way: Develop a system of independent programs (processes) on one processor, for example a simple evaluation module (EVM) from a DSP silicon vendor, and then later distribute the processes to separate DSPs when real target hardware appears, a
16. dozens is typically too great an expense. Moreover, hardware-based solutions may add extra noise and slow down full-speed, real-time processes, skewing the true profile and possibly preventing the application from running correctly in real time.

To overcome those drawbacks, we've developed a software-based statistical real-time profiler for internal use. It's easy to implement and requires no additional hardware.

THE CONCEPT

The fundamental concept of the statistical real-time software profiler is to take periodic snapshots of the DSP's instruction pointer (program counter). The captured information shows where the DSP spends most of its MIPS resources. Over time, these periodic (but random) snapshots typically converge on the true distribution function of the application. To learn how much processing time a DSP spends in each software function, you need to write an interrupt service routine (ISR) to handle the timer interrupt and sample the instruction pointer.

You can program the internal timer on the Texas Instruments TMS320C62x generation of DSPs to provide interrupts at given intervals. The actual rate of interrupts depends on the resolution and the speed of profile convergence you want. The ISR must find the return vector and record or store it for future use (Listing 1). However, the interrupts should occur infrequently; otherwise, they'll hamper the smooth operation of the applicatio
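The bookkeeping side of the concept described above can be sketched in portable C. This is not the article's Listing 1. It is a minimal illustration that assumes the timer ISR can obtain the interrupted program address (how it does so is hardware-specific and not shown) and that a table of function address ranges is available, for example from the linker map. The names prof_entry, prof_record, and prof_percent are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;  /* function name, for reporting */
    uint32_t start;    /* first program address of the function */
    uint32_t end;      /* one past the last program address */
    uint32_t hits;     /* samples that landed inside [start, end) */
} prof_entry;

/* Record one sampled program counter. In a real system this would be
 * called from the timer ISR with the interrupted address. Returns the
 * index of the matching entry, or -1 if the address is not in the table. */
static int prof_record(prof_entry *table, size_t n, uint32_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= table[i].start && pc < table[i].end) {
            table[i].hits++;
            return (int)i;
        }
    }
    return -1;
}

/* Share of all recorded hits, in whole percent, that belong to entry i.
 * Returns 0 when no samples have been recorded yet. */
static unsigned prof_percent(const prof_entry *table, size_t n, size_t i)
{
    uint64_t total = 0;
    assert(i < n);
    for (size_t k = 0; k < n; k++)
        total += table[k].hits;
    return total ? (unsigned)((100u * (uint64_t)table[i].hits) / total) : 0u;
}
```

A linear search keeps the sketch short. In an ISR that must stay brief, it may be preferable to store the raw addresses in a buffer and do the range lookup later, off line.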
17. [Figure 4 block labels: modulation, LUT indexer, lookup table, adaptive ...]

Figure 4. DSP-based adaptive baseband predistortion can greatly simplify the design of the RF power amplifier and enhance its efficiency. The solution shown here uses feedback from the PA output to update a lookup table operating on the baseband samples.

conversion processing in software using high-speed DSPs or user-programmable gate arrays. Some companies offer FPGAs suited to that task, and TI and others offer DSPs with sufficient processing speed, such as the TMS320C6000 and C5000 DSP platforms. Additionally, numerous third-party suppliers, including TI Third Party Network Members, are providing custom algorithms or development tools.

Digital signal processing engines. When the IF signal is in complex baseband form, you can process the signals using a dedicated ASIC (very limited flexibility), an off-the-shelf DSP solution (or a DSP core), or one or more FPGAs. Again, the maximum flexibility sought in our software-programmable WLAN solution is achieved using the DSP or FPGA option. A growing number of high-speed DSP devices can handle data rates of several megabits per second, including the C5000 platform of ultralow-power devices for portable use and the C6000 platform for very fast applications.
18. include MATLAB links for Code Composer Studio and Real-Time Data Exchange and Simulink targets for CCS and the TMS320C6701 EVM. The kit is available for Windows 95, 98, and NT and works with CCS version 1.2. Prices start at $1,000 for an individual PC license. The MathWorks, Inc., Natick, Mass.; (508) 647-7589; www.mathworks.com

DSP Imaging Evaluation Kit

A tool for building real-time audio and video compression applications, the Imaging Evaluation Kit addresses four phases of development: evaluating available technologies, assessing a DSP platform's suitability for an application, functional prototyping, and bringing redesigned systems to market quickly. A basic version sells for $2,995 and includes a TMS320C6111-based board and drivers, sample algorithms, and Code Composer Studio. Another version, which adds a camera, microphone, and speakers, sells for $6,495. A.T.E.M.E., Velizy, France; +33 1 46 01 55 72; www.ateme.com

USB Emulator for TMS320C54x

The SB-USB, a self-powered high-... emulator, connects to a PC's USB port to debug systems built around one or more TMS320C54x DSPs. Featuring two programmable counter timers, it occupies a 4- x 2.5-in. circuit board and operates seamlessly with Code Composer Studio. The emulator sells for $3,000 and includes cables, software drivers, and a user manual. Custom versions are available. Domain Technologies, Plano, Texas; (972) 578-1121; www.
19. of the hardware before writing any code, but you must also account for "kinks" in the hardware that may not have been fully documented.

[Figure 1 diagram: user code (application); kernel; device drivers; communications hardware.]

Figure 1. Device driver software provides plug-in knowledge of the specific hardware and maps between a common, device-neutral API and actual interprocessor communications hardware, which may differ for each DSP.

based on point-to-point Sundance Digital Links (SDLs) implemented by an on-board FPGA. There are various implementations of the technology: 20-Mb/s versions backward compatible with TI's TMS320C4x TIM-40 module comport standard, as well as faster ones that take advantage of more recent technologies. From a software point of view, to send data over a link, code must correctly initialize both the FPGA and the C6000 External Memory Interface (EMIF) CE1 control register; assign a CPU interrupt line to the transfer (avoiding any in use by concurrent I/O operations); initialize a C6000 DMA channel to transfer the data between memory and the FPGA data port (the addresses of the FPGA control registers, and the bits within them, are

That's a simple case compared with Spectrum Signal Processing's Daytona and Barcelona (dual and quad) C6000 boards, which use shared PCI SRAM blocks for communications. The equivalent code to s
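The layering in the device-driver figure, with application code above a device-neutral API and a driver per kind of communications hardware below it, can be illustrated with a small C sketch. It is not the API of Diamond, Virtuoso, or any other product named here. Every name in it (comm_driver, channel, chan_send, chan_recv, and the loopback driver standing in for real hardware) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Operations every driver must supply. Hardware details (registers,
 * DMA channels, interrupts) stay behind these two function pointers. */
typedef struct {
    int (*send)(void *hw, const void *buf, size_t len);
    int (*recv)(void *hw, void *buf, size_t len);
} comm_driver;

/* What application code sees: a channel, with no hardware knowledge. */
typedef struct {
    const comm_driver *drv;
    void *hw;               /* driver-private state */
} channel;

static int chan_send(channel *c, const void *buf, size_t len)
{
    assert(c && c->drv && c->drv->send);
    return c->drv->send(c->hw, buf, len);
}

static int chan_recv(channel *c, void *buf, size_t len)
{
    assert(c && c->drv && c->drv->recv);
    return c->drv->recv(c->hw, buf, len);
}

/* Stand-in "hardware": a one-message buffer in memory. A real driver
 * would program link or shared-memory hardware here instead. */
typedef struct {
    unsigned char data[64];
    size_t len;
} loopback_hw;

static int loopback_send(void *hw, const void *buf, size_t len)
{
    loopback_hw *lb = hw;
    if (len > sizeof lb->data)
        return -1;
    memcpy(lb->data, buf, len);
    lb->len = len;
    return 0;
}

static int loopback_recv(void *hw, void *buf, size_t len)
{
    loopback_hw *lb = hw;
    if (len != lb->len)
        return -1;          /* no message of that size is waiting */
    memcpy(buf, lb->data, len);
    lb->len = 0;
    return 0;
}

static const comm_driver loopback_driver = { loopback_send, loopback_recv };
```

Application code that calls only chan_send and chan_recv does not change when loopback_driver is swapped for a driver written for a specific link or shared-memory scheme, which is the property the article attributes to a device-neutral API.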
20. popular logic simulation platforms and are compatible with Mentor's library of PSPs offered by ... for use in multiprocessor systems.

Blue Wave Systems to Join Motorola's Computer Group

Blue Wave Systems (Carrollton, Texas, www.bluews.com), a long-standing TI DSP Third Party Network member, and Motorola, Inc. (Schaumburg, Ill., www.motorola...) have signed a definitive merger agreement in which Blue Wave will join the telecommunications business of Motorola's Computer Group (Tempe, Ariz., www.motorola.com/computer).

Best known for its ComStruct software environment, which includes the use of DSP/BIOS as well as integrated eXpressDSP-compliant algorithms, Blue Wave will continue its operations in Carrollton and Loughborough.

I-Logix Plots Software Component Strategy

I-Logix, Inc. (Andover, Mass., www.ilogix.com) has rolled out a three-phase plan to build a comprehensive development platform for writing and reusing seamless, component-based embedded software. The strategy, which will unfold during the course of the year, will let developers snap existing software components into their designs, similar to the way that IP building blocks are assembled to construct systems on chip and other ICs.

The first phase, available now in Rhapsody 3.0, lets developers "componentize" and reuse legacy software modules that can be viewed within the UML graphical model. The
21. possible in parallel, especially for MIPS-intensive loops, by providing information concerning the dependencies between instructions. You can use certain keywords that give the compiler hints as it tries to determine dependencies.

Another useful technique in the tuning phase is to use intrinsics, which are special functions that map directly to C6000 assembly instructions. The operations they expose are usually not easily expressed in C. They allow you to have more precise control over the selection of instructions by the compiler.

For example, some intrinsics operate on data stored in the low and high portions of a 32-bit register. Consequently, if you're operating on a stream of 16-bit values, you

Listing 1. Fixed-Point FIR Filter with Data Move

/*
 * Routine Name: FIR_Filter_Shift
 *
 * Description:
 *   Performs fixed-point implementation of FIR filtering with
 *   data move.
 *
 * Calling Sequence:
 *   filtered_sample =
 *     short FIR_Filter_Shift(
 *       short *taps,
 *       short *coefs,
 *       unsigned short length,
 *       unsigned short base)
 *
 * Where:
 *   taps   - pointer to filter taps delay line
 *   coefs  - pointer to filter coefficients, which are
 *            stored in reverse order
 *   length - length of taps delay line
 *   base   - base of filter coef
22. second phase will equip Rhapsody with an intuitive visual metaphor for assembling model-based executable code components into embedded real-time applications.

In the third phase the company will provide a Web-based structure to organize and catalogue software components, encouraging the exchange of design information. As part of the plan, I-Logix will also integrate INOTION product life-cycle technology into its existing products. The technology, acquired in March from KLA-Tencor (San Jose, Calif., www.kla-tencor.com), will let embedded development teams store, support, and maintain design components in a central repository.

More Breakpoints on page 33
24. 00 and $2,450, respectively. Kane Computing, Cheshire, U.K.; +44 (0)1606 351006; www.kanecomputing.com

Needed: New Compilation Tools to Help Optimize Embedded Code

By Alan S. Ward

For embedded software programmers, the performance of code generated from a high-level language is becoming increasingly important. At the same time, code size is often a critical concern. Three trends (the leaps in the capability of embedded processors, the significantly greater complexity of embedded software, and the mushrooming of a variety of handheld devices) are driving a huge and rapidly expanding need for tools to help automate programming. In order to keep up with these changes, embedded software programmers clearly need a set of accessible and easy-to-use tools to optimize their code for performance and size demands.

Many embedded compilation tools already possess the sophistication to highly optimize applications. The difficulty is extracting that capability from the tools. Consider the task of scheduling code for a VLIW DSP that supports eight paralle
25. ND D/A CONVERSION

Many manufacturers, including Texas Instruments, are competing aggressively in the area of high-speed digital IF converter solutions, and new devices are appearing on the market almost weekly.

Digital up- and down-conversion. The initial task for the digital processing unit is to convert the IF signal into a complex baseband form (down-conversion) and from complex baseband to IF (up-conversion). Down-conversion ensures minimum sampling rate processing for the remaining radio functions. The tasks involve mixing (multiplication) with quadrature versions of a digital oscillator. In addition, a process of interpolation and decimation is needed in the up-converter and down-converter, respectively, to optimize the sampling rate between the digital IF requirements and the complex baseband requirements.

Because the two functions (frequency mixing and sample rate conversion) are common to all digital IF solutions, various manufacturers have produced dedicated ICs optimized for digital up- and down-conversion. The high sample rate processing associated with the digital-to-analog interface can be accommodated in these hardwired devices, allowing the (comparatively) slower digital signal processing of the wireless signal content to be undertaken in cheaper, lower-power, software-programmable DSPs.

An alternative route that maintains full flexibility of design is to implement the mixing and sample rate
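The mixing and decimation steps described above can be sketched in a few lines of C. This is only an illustration under simplifying assumptions that are not in the article: the IF is taken to sit at exactly one quarter of the sample rate, so the quadrature oscillator reduces to the sequences {1, 0, -1, 0} and {0, -1, 0, 1}, and a plain running sum stands in for a proper low-pass decimation filter. The names ddc_fs4 and iq_t are hypothetical.

```c
#include <stddef.h>

/* One complex baseband sample: in-phase and quadrature parts. */
typedef struct { long i; long q; } iq_t;

/*
 * Down-convert a real IF signal centred at fs/4 and decimate by `decim`.
 * Assumption: at fs/4 the oscillator samples are cos = {1,0,-1,0} and
 * -sin = {0,-1,0,1}, so mixing reduces to sign changes and zeros (written
 * as table multiplies here for clarity). A boxcar sum stands in for the
 * low-pass decimation filter a real design would use.
 * Returns the number of output samples written to `out`.
 */
size_t ddc_fs4(const short *x, size_t n, size_t decim, iq_t *out)
{
    static const int cos_tab[4]  = { 1,  0, -1, 0 };
    static const int nsin_tab[4] = { 0, -1,  0, 1 };
    size_t n_out = 0;

    if (decim == 0)
        return 0;

    for (size_t k = 0; k + decim <= n; k += decim) {
        long acc_i = 0, acc_q = 0;
        for (size_t m = 0; m < decim; m++) {
            size_t idx = k + m;
            acc_i += (long)x[idx] * cos_tab[idx & 3];   /* in-phase mix   */
            acc_q += (long)x[idx] * nsin_tab[idx & 3];  /* quadrature mix */
        }
        out[n_out].i = acc_i;
        out[n_out].q = acc_q;
        n_out++;
    }
    return n_out;
}
```

Feeding the routine a tone that sits exactly at the assumed IF should produce a constant in-phase output and a zero quadrature output, which is a quick sanity check on the sign conventions.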
26. a development platform for the profiler host and internally developed C62x-based PCI target boards for the target software.

IMPLEMENTING THE DSP SIDE

The ISR identifies the returned vector and records it in a buffer for future transfer to the host. A special flag in the host or target API indicates whether recording is enabled or disabled. By using the flag, the host application can isolate specific function branches of the overall target application to be profiled. Although enabling and disabling interrupts could also achieve that, the method isn't recommended, since it synchronizes code sections with the timer. The law of averages permits convergence only if the timer and application remain uncorrelated.

Figure caption: The profiler's recalculated values should be displayed in a table of at least two columns. The first column lists the function names, and the second the percentage value of DSP MIPS consumption. Arranging the table using the values in column 2 in descending order shows the functions that the DSP spends the most time on displayed first.

Every CPU can receive interrupts and store the return vector in a specific manner; some have a special-purpose register (in the C62x device, the interrupt return pointer, IRP). The addresses can be accessed and recorded. To prevent the host sys
27. a preconfigured size.

After the decoding phase and matching the instruction counter data with the map file addresses, the recalculated values are displayed in a results table with two or (optionally) more columns. The first column displays the function name and the second displays the percentage value of DSP MIPS consumption (see the figure). The data should be arranged in descending order, according to the percentage of the accumulated run time. Thus the function that the DSP spends the most time on is displayed first.

Our mature development environment, including a highly functional and robust Windows NT-based host control and monitoring application, greatly facilitated implementing the profiler. Using it, our R&D team optimized the most widely used functions in the Surf Multi-access Pool (SMP) application for terminating the V.90 modem, G.7xx VoIP, and V.17/T.38 FoIP. As a result, more channels were able to run simultaneously on a single DSP chip. For example, in the case of Viterbi decoders, consumption decreased by one half in the modem data pump, enabling us to reach our target of 15 fully convergent channels on a single TMS320C6202 DSP.

Konstantin Merhker (kostikm@surf-com.com) is a software engineer for Surf Communication Solutions, Ltd., Yokne'am, Israel, responsible for system analysis and the optimization of Surf's products and solutions.

Jacob Bridger (jbridger@surf-com.com) is Surf's vice presiden
28. ance's second-largest telecommunications equipment maker. The aim of the deal is to develop a reference design for use in digital TV set-top boxes. The design will feature PCTEL's Solsis embedded modem for accessing the Internet and include the TMS320C5000 DSP platform. The set-top boxes will be sold or licensed to European television service providers, like France's Canal+, which will then sell or rent them to customers.

Corner

Experts Answer Your Questions

Q: Using Code Composer, can I debug a target board containing two DSPs of different platforms in a single JTAG scan path?

A: In this case, you'll need to launch two separate instances of Code Composer to support each of the DSP platforms. Two separate directories should be created for Code Composer files, the setup utility will need to be run in each of these directories, and the DSP not being targeted in one instance of Code Composer should be bypasse
29. arate users often requires a radio or free-space optical transmission link. Data rates in excess of 10 Mb/s are called for, with upward of 100 Mb/s a target for larger business users. In addition, a point-to-multipoint network or a distributed network is needed for distribution from a fiber hub.

For the WLAN terminal, your design challenge is to achieve robust communications at those high data rates, with good spectral efficiency (to maximize the number of customers that can be served for a given frequency allocation). Of course, minimal setup time, low maintenance, and a competitive price are assumed. From a manufacturing standpoint, the wireless platform must be easily tailored to new frequency bands as they become available and be able to flexibly exploit and manage the characteristics of the channel and variable data transfer demands.

Those goals are now achievable. By harnessing the ever-increasing performance/power potential of modern DSP devices such as the Texas Instruments TMS320 DSP family, you can build a highly configurable, low-cost solution with minimal frequency-selective components, high efficiency, and small size.

Unlike the cellular phone market, which is tightly controlled by standards (GSM, IS-95, and third-generation (3G) Universal Mobile Telephony Service (UMTS)), the WLAN market has no dominant air interface standard. Literally dozens of proprietary s
30. arrier-class media gateway, SS7 signaling gateway, cellular infrastructure, and unified messaging systems. Included are octal T1/E1 network access ports, 1860 signaling controller, PowerPC 750 protocol controller, high-density DSP resource mezzanine board, and embedded software for convergent voice and data applications. Besides sporting VoIP, wireless, V.90, and G3 fax software for the TMS320C5000 DSP platform, SuperSpan embeds H.323, MGCP, SIP, TCP/IP, SS7, and TCAP/ISUP signaling stacks and internetworking functions. Prices start at $3,000. Voiceboard Corporation, Oxnard, Calif.; (805) 985-6200; www.voiceboard.com

Assistant Enhances Development Tools

Development Assistant for C works independently or alongside Code Composer Studio. The assistant uses an ActiveX interface to communicate with Code Composer Studio and debugger commands. Among its features are an editor with structured and nonstructured flowcharts, start debugger commands for Code Composer Studio, symbol browser, call and type hierarchy graph, makefile generator, software metrics, interface to version control systems, project manager, and static code analyzer. Starting prices range from $295 to $660 each, depending on the target DSP. RistanCASE GmbH, Wallisellen, Switzerland; +41 1 883 35 70; www.ristancase.ch

IP Phone Chip

The IP Phone, a chip built around a 100-MHz TMS320C549, delivers
31. ationship between CIO's malloc/free and MEM_alloc and MEM_free?

A: DSP/BIOS overrides the standard malloc and free functions with calls to MEM_alloc and MEM_free. The segment allocated by malloc is controlled by the "Segment for malloc()/free()" setting inside the MEM Manager properties.

Q: How much memory does the memory management system require?

A: As long as no heaps are defined, no memory is used by the MEM module. If your application requires dynamic memory allocation, a small number of words are required for each heap defined. Beyond that, only memory defined as a heap is required.

Q: How can I control in what memory sections DSP/BIOS objects are placed?

A: The DSP/BIOS Configuration tool lets you place all the objects in different memory locations (declared in the Memory Manager) through each manager module.

Q: Can DSP/BIOS run in extended memory on C54x processors?

A: Yes, the DSP/BIOS Configuration tool allows you to select the appropriate library under Global Setting. DSP/BIOS requires that the .bios, .sysinit, and .vect sections be placed on the overlay (OVLY = 1) section of memory (0x000000 to 0x008000). These sections contain wrappers to support extended memory and are expected in the start-up sequence. All other sections and objects can be placed anywhere in memory. For more information on extended memory with DSP/BIOS, go to www.ti.com/sc/docs/apps/dsp/tms320c5000app.html and locate document SPRA599.

High Den
32. Figure 3. Obtaining an instantaneous measure of the frequency of a signal is very difficult using analog methods. In contrast, the algorithm for frequency discrimination shown here uses quadrature proce
33. ch of the solution space. Indeed, such a manual, iterative process is still used to determine a satisfactory solution. What's more, this problem is only one example; developing production-quality embedded code with old compilation tools involves many trial-and-error processes.

Fortunately, compiler tools are appearing that address this level of interaction between the tools and their users. A profile-based compiler uses criteria supplied by the programmer to automatically build and profile multiple sets of options for coding all the software's key functions; then it plots the most favorable option combinations on a curve that represents different trade-off values between performance and code size. Using the graph, programmers can select the right trade-off for the design's requirements, like the tightest code for a given cycle count or the fastest execution at a given memory size. Thus this type of compiler can save weeks of effort and assure developers that they have the best solution for reconciling performance with code size.

Other examples of new or upcoming compilation tools include ones that structure code sequences to better map to the underlying processor and ones that experiment with the memory layout of code or data to utilize on-chip memory or cache most effectively.

Alan Ward is the C6000 DSP compiler tools manager and a distinguished memb
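The trade-off curve described above can be illustrated with a short C sketch. It is not how any particular compiler does it; it only assumes that each candidate build has already been profiled into a (cycles, code size) pair, keeps the builds that no other build beats on both axes, and then picks the smallest build that fits a cycle budget. All names (build_t, is_on_tradeoff_curve, tightest_within_budget) are hypothetical.

```c
#include <stddef.h>

/* One profiled build of a function: cycle count and code size in bytes. */
typedef struct { unsigned long cycles; unsigned long size; } build_t;

/*
 * Returns 1 if build k belongs on the trade-off curve, that is, if no
 * other build is at least as good on both axes and strictly better on one.
 */
int is_on_tradeoff_curve(const build_t *b, size_t n, size_t k)
{
    for (size_t j = 0; j < n; j++) {
        if (j == k)
            continue;
        int no_worse = b[j].cycles <= b[k].cycles && b[j].size <= b[k].size;
        int better   = b[j].cycles <  b[k].cycles || b[j].size <  b[k].size;
        if (no_worse && better)
            return 0;   /* build j dominates build k */
    }
    return 1;
}

/*
 * "Tightest code for a given cycle count": index of the smallest build
 * whose cycle count does not exceed max_cycles, or -1 if none qualifies.
 */
long tightest_within_budget(const build_t *b, size_t n, unsigned long max_cycles)
{
    long best = -1;
    for (size_t j = 0; j < n; j++) {
        if (b[j].cycles > max_cycles)
            continue;
        if (best < 0 || b[j].size < b[best].size)
            best = (long)j;
    }
    return best;
}
```

The mirror-image query (fastest execution at a given memory size) follows the same pattern with the two fields swapped.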
34. corresponding run-time library. C code using the API is compiled and linked by CCS's tools, as usual. Thus, just as the CCS optimizing C compiler frees you from explicitly coding DSP instructions, a code-based framework frees you from coding board-level interprocessor communications at the hardware register level.

Frameworks differ significantly in the area of system configuration. Generally, a framework has a configuration file that governs the mapping of software tasks onto the physical topology of the processor network. Code-based systems like Diamond or Virtuoso use a text file; graphical systems like Pegasus generate the file automatically from the system block diagram on screen.

The underlying configuration technology also varies. Virtuoso uses the standard linker to bind tasks, producing one executable file for each node so that tasks on the same node share a single namespace for external variables and functions. Diamond, on the other hand, can bind multiple separately linked images for each node into a single bootable file for the whole network. Any node can run multiple programs, each with its own namespace, like a process in Unix or Windows NT. An RTOS kernel component with full preemptive scheduling of dynamically created threads is loaded onto each target processor. Basic debugging takes place via JTAG with CCS, as usual. In fact, you can view a communications framework like Diamond as co
35. cular addressing feature of the C6000 in your linear assembly code. It's not unreasonable to set a goal of doubling the number of modems from 12 to 24 in this step alone.

For the most part, a modem is a series of filters. Each filter is computed from a sequence of input data, or taps, and an equal number of coefficients. A multiply-accumulate operation is performed with each tap and a corresponding coefficient. After the computation, the taps are shifted to make room for the new input (Figure 2a). Circular addressing changes the starting point for the MAC cycle, eliminating the shifting altogether (Figure 2b).

Without hardware support for this operation, the C code for the iterative loop is of the form in Listing 1. The C6000 has hardware support for circular addressing, though. By setting the addressing mode register (AMR) appropriately, you can specify the general-purpose register or registers that will be used for circular addressing, as well as the size of the memory block that will be addressed circularly (Listing 2).

Just as using the optimizer has its challenges, so can adding circular addressing. You might find that you add circular addressing and then the optimizer breaks it. It turns out the optimizers in both CCS 1.1 and 1.2 don't take circular addressing into account. For example, the optimizer will often move an address from a register configured for circular addressing to another register before performi
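To make the idea of "moving the starting point instead of moving the data" concrete, here is a minimal portable C sketch. It is not the article's Listing 2 and does not touch the AMR; it emulates in software what the hardware does for free, by wrapping an index with a bit mask. The filter length, the structure fir_state_t, and the function name fir_circular are all assumptions made for illustration.

```c
#include <stddef.h>

#define FIR_LEN 4   /* assumed length; a power of two so a mask can wrap the index */

typedef struct {
    short    taps[FIR_LEN];  /* delay line, used as a ring buffer    */
    unsigned head;           /* index of the most recent input sample */
} fir_state_t;

/*
 * Insert one input sample and return one filtered output sample.
 * coefs[0] weights the newest sample, coefs[1] the one before it, and so on.
 * `base` is the number of fractional bits removed from the accumulator.
 * No tap is ever copied: only `head` moves.
 * Note: right-shifting a negative value is implementation-defined in C;
 * an arithmetic shift is assumed here.
 */
short fir_circular(fir_state_t *s, const short *coefs, short in, unsigned base)
{
    long     sum = 0;
    unsigned idx;

    /* Step the start point back by one and overwrite the oldest sample. */
    s->head = (s->head + FIR_LEN - 1) & (FIR_LEN - 1);
    s->taps[s->head] = in;

    /* Multiply-accumulate, walking from newest to oldest with wraparound. */
    idx = s->head;
    for (size_t k = 0; k < FIR_LEN; k++) {
        sum += (long)s->taps[idx] * coefs[k];
        idx = (idx + 1) & (FIR_LEN - 1);
    }

    if (base > 0)
        sum += 1L << (base - 1);   /* round before removing the base */
    return (short)(sum >> base);
}
```

On the C6000 the masking in the loop is what the circular addressing hardware replaces, which is why the hand-written linear assembly version can drop both the data move and the explicit wrap.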
36. d. Do the same for the remaining DSP. Bypassing DSPs and scan chain devices is discussed in Chapter 1, "Setting Up Code Composer," of Code Composer User's Guide.

Q: Should I use the interrupt keyword when implementing an interrupt service routine in a DSP/BIOS application?

A: You can't use the C compiler's interrupt keyword in DSP/BIOS programs. DSP/BIOS interrupt routines must be written in assembly language and must use the HWI_enter and HWI_exit macros. The C6000 version of DSP/BIOS has an interrupt dispatcher that allows you to write interrupt routines in C. You can also write a C interrupt service routine by making a small .asm file that includes just HWI_enter, call cfxn, and HWI_exit.

Q: Can I define my own linker command (.cmd) file instead of one created by the DSP/BIOS configuration tool?

A: Since the Code Composer Studio build tool allows only a single linker command file per project, the best approach is to list the DSP/BIOS linker command file at the top of the user-defined linker command file. To list the DSP/BIOS linker command file in the user-defined CMD, add the following line to the top of the file, replacing it with the actual design name:

    yourappcfg.cmd

Q: Can DSP/BIOS run on the simulator?

A: Yes, DSP/BIOS runs on the simulator. The simulators currently do not contain a timer interrupt source, so the clock (CLK) and the periodic function (PRD) are effectively disabled.

Q: What is the rel
37. dem_main
    006d8c44  FAX_DP_version
    0075b9f8  FAX_GetSizeOfObject
    00764930  ORG_Get_Size_Of_Obj
    00764900  FAX_ORG_init
    0076a8e0  FAX_ORG_modem_main

ranging from 5 to 30 ms.

What's more, since our statistical profiler converges over time, it isn't appropriate for tracking MIPS demands that occur infrequently or during initialization. In that case, you can write special support routines that loop the routines many times until the profiler converges.

IMPLEMENTING THE PROFILER

The C62x DSP is equipped with programmable timers. Some of the timers are active and perform various functions while the application is running; others are dormant and available for use. To implement a software-based profiler, the DSP must have an unused timer that is capable of generating interrupts.

In addition, the host system should be able to collect the recorded entries from the DSP and decode memory addresses (according to the function names to which they belong). To do so, the host system must initially parse a map file produced by a linker. When the system is running, the DSP sends the recorded addresses as packets. The addresses point to the functions, and therefore their usage value, in percentage points, can be updated or recalculated. The output is a table that includes the names of functions and a percentage value of CPU consumption.

The process can be implemented on any host platform. We use Windows NT as 
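The host-side decoding step lends itself to a small C sketch. The code below is an assumption about how such a decoder could be organized, not the authors' implementation: it presumes the map file has already been parsed into a table of function start addresses sorted in ascending order, attributes each sampled return address to the last function that starts at or below it, and counts hits per function. The names sym_t, find_symbol, and tally_samples are hypothetical.

```c
#include <stddef.h>

/* One parsed map-file entry; the table must be sorted by ascending address. */
typedef struct { unsigned long addr; const char *name; } sym_t;

/*
 * Binary search: index of the last symbol whose start address is <= pc,
 * or -1 if pc lies below the first symbol.
 */
long find_symbol(const sym_t *syms, size_t n, unsigned long pc)
{
    size_t lo = 0, hi = n;              /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].addr <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long)lo - 1;
}

/*
 * Add every sampled address to the hit counter of the function it falls in.
 * Returns the number of samples that could be attributed; the percentage
 * for function i is then 100.0 * hits[i] / (returned count).
 */
size_t tally_samples(const sym_t *syms, size_t n_syms,
                     const unsigned long *pcs, size_t n_pcs,
                     unsigned long *hits)
{
    size_t attributed = 0;
    for (size_t k = 0; k < n_pcs; k++) {
        long s = find_symbol(syms, n_syms, pcs[k]);
        if (s >= 0) {
            hits[s]++;
            attributed++;
        }
    }
    return attributed;
}
```

Sorting the resulting counters in descending order gives the results table the article describes, with the most expensive function on top.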
38. dn't be unusual, and decent spectral efficiency. Naturally, minimal setup time, low maintenance, and a competitive price go along, and provision for emerging frequency bands.

Those goals are achievable, says Andrew Bateman, the author of our cover story. Indeed, the power of modern DSP devices and software can comfortably embrace source and channel coding, pulse shaping, modulation, demodulation, quadrature frequency translation, power amplifier linearization, receiver dynamic range extension, calibration of the in-phase and quadrature signal components, automatic power and frequency control, and direct digital synthesis. Unfortunately, there isn't enough room inside to focus on everything, but you'll get a good idea of how DSPs and software can be tapped for pulse shaping, demodulation, and PA linearization.

Are you more interested in using C-coded reference modems? Want to stuff as many as possible onto a single chip without assembly coding? Ah, the power of Code Composer Studio's optimizing compiler for DSP platforms, especially when it's tickled to up the density. How dense? According to Commetrex, which went through the optimizing process, C baseline modems can soar from 6 per 200-MHz DSP chip to 28 in four short project phases. Need even more? A fifth phase can take the number of channels to 48 per DSP. Get the details from Ghassan Farah, who shows you exactly how to go about it yourself.

As powerful as DSP chips a
39. e improved significantly, with 14-bit, 100-megasample-per-second, 80-dB SNR converters readily available. These devices can easily support 40-MHz subsystems using direct sampling and can extend toward 300-MHz IF solutions using subsampling methods.

The power consumption of fast converters is significant (several hundred milliwatts), and further reduction of IC feature size or other advances in process technology are required to realize the power savings needed to allow IF sampling in handheld equipment. For fixed WLAN installations, however, power consumption is less critical; instead, linearity and sampling rate are likely to dominate the choice of frequency.

DSP tasks for a high-speed wireless Internet modem comprise core modem functionality (source and channel coding, pulse shaping, modulation, demodulation) and more advanced software radio management tasks (quadrature frequency translation, PA linearization, receiver dynamic range extension, calibration of the in-phase and quadrature (I/Q) components of the signal, automatic power and frequency control, and direct digital synthesis, for example). We'll focus here on pulse shaping, demodulation, and PA linearization.

PULSE SHAPING

To maximize the data transmission rate over a wireless link with finite bandwidth, you must shape the data pulses modulating the carrier signal. For modems using frequency shift keying (FSK), shaping traditionally involved Gaussian filtering
40. New software tools are taking aim at multiprocessor DSP systems, particularly for the C6000 DSP platform.

Although many of the latest top-of-the-line processors are extremely fast (and the 1.1-GHz TMS320C6000 from Texas Instruments is extremely fast), some customers still have performance requirements that mandate a multiprocessor solution. That's why many multiprocessor C6000 systems are already in the field.

New software tools like TI's Code Composer Studio (CCS) are helping to speed software development, but for the most part they're targeted at the DSP device. Although CCS supports loading and debugging multiprocessor target systems via JTAG, its focus is understandably the DSP chips, not the additional board-specific hardware involved in interprocessor communications.

Complementary software tools are needed to truly unlock the promise of faster software development for multiprocessor DSP systems. Indeed, software tools for multiprocessor C6000 development are now emerging, such as Diamond from 3L, Virtuoso from 
41. end a message from one processor to another must correctly initialize the dedicated communications ASIC (Hurricane), which implements a point-to-point link and has 64 individual control registers, as well as shared memory buffer data to ensure that the buffer addresses are valid in the address space of each processor. It must chop the data to be transferred into chunks that fit into the SRAM bank visible to the Hurricane ASIC and, for each chunk, create a DMA channel control program in memory to drive Hurricane and start it; last, it must field interrupts from the ASIC when channel program operation is complete, either moving on to the next chunk (which may involve copying further data into the Hurricane SRAM) or signaling the user code that the transfer is completed.

Figure 2. A four-stage pipeline can be developed on a single-processor board, then deployed without source changes on a four-DSP board for a 4x speedup. (Panels: logical data processing pipeline; single-DSP EVM, development; four-DSP target board, deployment.)

Although the managing code is often scattered throughout a project's application-level programs, it's obvious that the same abstract operation is performed in both cases: "Send this much data from here in local memory to a receiver on another processor."

Abstraction decouples the app
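The abstract operation quoted above ("send this much data from here to a receiver") can be separated from the hardware details with a very small interface. The C sketch below is purely illustrative: it knows nothing about the Hurricane ASIC, its registers, or DMA channel programs. It only shows the chunking loop, with the hardware-specific work hidden behind a caller-supplied hook. The names chunk_fn, send_message, and count_bytes are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical low-level hook: moves one chunk across the link and
 * returns 0 on success. On real hardware this is where the buffer copy,
 * DMA setup, and completion handling would live.
 */
typedef int (*chunk_fn)(const unsigned char *chunk, size_t len, void *ctx);

/*
 * Send `len` bytes by splitting them into pieces no larger than `window`
 * (standing in for the size of the buffer the link hardware can see).
 * Returns the number of chunks sent, or -1 if the hook reports an error
 * or the window size is zero.
 */
long send_message(const unsigned char *data, size_t len, size_t window,
                  chunk_fn send_chunk, void *ctx)
{
    long chunks = 0;

    if (window == 0)
        return -1;

    while (len > 0) {
        size_t this_len = len < window ? len : window;
        if (send_chunk(data, this_len, ctx) != 0)
            return -1;
        data += this_len;
        len  -= this_len;
        chunks++;
    }
    return chunks;
}

/* Example hook for host-side testing: just totals the bytes it is handed. */
int count_bytes(const unsigned char *chunk, size_t len, void *ctx)
{
    (void)chunk;
    *(size_t *)ctx += len;
    return 0;
}
```

Application code written against send_message stays the same whether the hook drives a shared-memory buffer, a point-to-point link, or a test stub like count_bytes, which is the decoupling the text is describing.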
42. er, the Hilbert transformer, the demodulator, and the interpolator. Without vectorization, the sample rate section of the receiver processes one sample at a time, taking it through each successive section. Consequently, the overhead of calling each filter in the sample rate section is incurred 80 times for each 80-sample buffer. With vectorization, the sample rate section is called once for each 80-sample buffer. An input buffer of 80 samples is then passed to the pulse-shaping filter, which produces 80 samples to be passed to the Hilbert filter, which in turn produces 80 outputs, and so on. In the sample rate section, the number of function calls required to process 80 samples is reduced from 320 to just 4. In addition, processing the input buffer in a loop format as opposed to sample by sample allows the optimizer to do a better job of pipelining, significantly improving efficiency.

We haven't completed the vectorization phase of this project, but we will report the results on our Web site (www.commetrex.com) when we do.

Ghassan Farah (Ghassan.Farah@commetrex.com) is manager, signal processing technologies, at Commetrex Corporation in Norcross, Ga. He has four years' experience in designing and implementing a variety of DSP algorithms. His technical interests include data and fax modems, telephony, speech coding, and signal classification.
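The call-count arithmetic (320 calls versus 4 for an 80-sample buffer and four stages) can be checked with a toy C sketch. The stage below is only a placeholder that halves each sample; a real receiver stage would be a filter with its own state. The point is the calling pattern, not the signal processing. All names (stage, run_per_sample, run_vectorized, g_calls) are hypothetical.

```c
#include <stddef.h>

#define BLOCK    80   /* samples per buffer, as in the text      */
#define N_STAGES 4    /* sections in the sample rate processing  */

static unsigned long g_calls;   /* counts stage invocations */

/* Placeholder stage: halves each sample. Safe to run in place. */
static void stage(const short *in, short *out, size_t n)
{
    g_calls++;
    for (size_t k = 0; k < n; k++)
        out[k] = (short)(in[k] / 2);
}

/* Sample at a time: every sample visits every stage in a separate call. */
unsigned long run_per_sample(const short *in, short *out)
{
    g_calls = 0;
    for (size_t k = 0; k < BLOCK; k++) {
        short v = in[k];
        for (int s = 0; s < N_STAGES; s++)
            stage(&v, &v, 1);
        out[k] = v;
    }
    return g_calls;
}

/* Vectorized: each stage is called once on the whole buffer. */
unsigned long run_vectorized(const short *in, short *out)
{
    g_calls = 0;
    for (size_t k = 0; k < BLOCK; k++)
        out[k] = in[k];
    for (int s = 0; s < N_STAGES; s++)
        stage(out, out, BLOCK);
    return g_calls;
}
```

Both drivers produce the same output here; the vectorized form simply moves the per-sample loop inside each stage, which is also the shape a software-pipelining optimizer handles best.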
43. er of the technical staff at Texas Instruments, Inc. in Houston.
44. er since computers became inexpensive enough to use more than one on the same problem, the industry has gained a lot of experience about the significant software problems introduced by multiprocessing.

Partitioning of the problem and its data. Getting multiple processors to work together effectively on the same task involves splitting the job into chunks that can be efficiently processed independently or with minimum communications. Except in special cases, partitioning is as much art as science. Creative engineering is needed here, and software is of little help.

Processor communications and synchronization. Although some systems can consist of multiple, completely independent activities, in most cases the processors must exchange data via shared memory or point-to-point links.

With shared memory, each processor has access to a shared bank of memory, often through a common bus. The processors signal the availability of data by interrupts. Point-to-point links provide a dedicated hardware path between each pair of processors that need to communicate. Some DSP hardware directly supports that model, such as TMS320C4x communications ports (comports) or C6000 multichannel buffered serial ports (McBSP).

Simulation. It's hard to develop the software for a multiprocessor application on custom hardware before the hardware is available and substantially debugged. The usual single-processor solution 
45. f filter length (and hence delay and processor load) for stopband attenuation and roll-off rate. Most filter design packages offer raised-cosine (RC) and RRC filter options, making it easy to explore the trade-off between the two parameters.

In many cases, it may be preferable to cascade an RRC filter with a second FIR filter (possibly half-band for ease of implementation). The second filter achieves the desired level of out-of-band attenuation to meet a given spectral mask while relaxing the requirements on the first (and more processor-intensive) filter.

The alternative approach to pulse shaping makes use of a lookup table. The table holds the precalculated values for the pulse response of the desired filter, based on all possible input state transitions (Figure 2). The state transition is used to index the correct stored pulse response from the chosen filter, which is then summed with the pulse responses from previous transitions to form the composite pulse-shaped waveform.

The lookup table method is preferred when execution time is at a premium, as lookup table indexing carries very low overhead. Conversely, the real-time filter realization of pulse shaping is used when memory space is at a premium and storage of the multiple filter pulse responses is impractical.

DATA DEMODULATION

Efficient algorithms for data demodulation are key to the success of software-defined wireless modems. A broad range of demodulation a
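The lookup table method of pulse shaping can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the table is indexed by the symbol value rather than by the full state transition, the four-sample pulse is a made-up stand-in for a real RRC response, and every name (pulse_lut, lut_pulse_shape, SPS, PULSE_LEN) is hypothetical rather than taken from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPS        2   /* output samples per symbol (assumed) */
#define PULSE_LEN  4   /* stored pulse length: spans two symbols (assumed) */
#define NSYM       2   /* binary symbols for simplicity */

/* Precalculated pulse responses, one row per symbol value.
   The values are placeholders, not a real RRC response. */
static const int pulse_lut[NSYM][PULSE_LEN] = {
    { -1, -3, -3, -1 },   /* symbol 0 */
    {  1,  3,  3,  1 },   /* symbol 1 */
};

/* Sum the stored pulse for each symbol into the output waveform.
   out must hold nsym * SPS + (PULSE_LEN - SPS) samples.
   Returns the number of output samples written. */
size_t lut_pulse_shape(const unsigned char *symbols, size_t nsym, int *out)
{
    size_t out_len = nsym ? nsym * SPS + (PULSE_LEN - SPS) : 0;
    memset(out, 0, out_len * sizeof out[0]);
    for (size_t n = 0; n < nsym; n++) {
        const int *p = pulse_lut[symbols[n] % NSYM];   /* table lookup */
        for (size_t k = 0; k < PULSE_LEN; k++)
            out[n * SPS + k] += p[k];                  /* overlap and add */
    }
    return out_len;
}
```

The inner loop contains only additions, which is where the low execution overhead comes from; the cost is the memory needed to store one pulse per table entry.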
46. ficients, Returns

[Listing residue: short FIR_Filter_Shift(taps, coefs, length, base), a C FIR filter routine that steps through the taps and coefficients while accumulating an int sum, then rounds and removes the base with return (sum + (1 << (base - 1))) >> base. The loop body did not survive extraction.]

can use word (32-bit) accesses to read and process two 16-bit values at a time.

Even though phases 2 and 3 may double the number of simultaneous instances of the code running on one chip, the modems are still coded in C that's easy to understand and maintain.

PHASE 4: CIRCULAR ADDRESSING

At this point, if your performance requirements are not yet met, you go on to phase 4, converting MIPS-intensive portions of the code into linear assembly code. This form of assembly code doesn't require that you provide functional unit selection, pipelining, parallelization, or register allocation; those tasks will still be performed by the compiler. It will, however, give you more control over the exact C6000 instructions to be used. You can also pass more useful information to the tools, such as which memory bank is to be used.

Modems use a number of delay lines for the different filters, resulting in MIPS-intensive memory shifting. You can avoid that by employing the cir
47. for the more advanced quadrature amplitude modulation (QAM) modems, a root-raised-cosine (RRC) filter is commonly employed. Whatever the pulse shaping, there are two core algorithms for implementation: the classical filter and the lookup table (LUT) method.

[Figure 2 diagram: four stored tables of RRC pulse sample values, one per amplitude level (0, 0, 0; A1: 0.1, 0.25, 0.39; A2: 0.2, 0.5, 0.78; last level: 0.3, 0.75, 1.17), feeding an optional modulator.]

Figure 2. Although appearing more complex at first glance, the lookup table (LUT) method of symbol pulse shaping, using the data transitions to index the stored RRC pulse shape, imposes little processing overhead. The output waveform is synthesized from the summation of the pulse samples for several preceding data bits.

The filter approach is normally realized using an infinite impulse response (IIR) filter for a Gaussian filter shape and a finite impulse response (FIR) filter for the RRC shape. The FIR filter realization allows the synthesis of a near-perfect RRC transfer function with linear phase and controlled stopband attenuation. (IIR approximation of an RRC filter can be used, but in the majority of cases, the FIR structure yields a lower-overhead algorithm than the compensated IIR design.) The RRC filter shape is always an approximation of the true Nyquist filter response, trading of
48. g I/O from a node to a host system and the hardware independence required for simulation are omitted in the rush to get something working. Ironically, lacking simulation facilities, the project is likely to take much longer than if you used a tool that supports hardware-independent communications. Such tools offer many benefits.

Complete, off-the-shelf solutions. Solutions for communications, simulation, and debugging problems are feasible if the project is able to take advantage of commercial off-the-shelf (COTS) multiprocessor hardware in conjunction with software tools that abstract interprocessor communications. For example, Diamond software runs out of the box with multiprocessor C6000 boards from Spectrum Signal Processing and Sundance Multiprocessor Technology; it includes drivers for the different interprocessor communications hardware used by the boards, removing a requirement for scarce driver development skills from the project's critical path.

Simple communications API. High-level tools for multiprocessor software development solve communications problems by providing a clean, simple API for application code.

Software development for custom hardware using COTS boards. When you use tools that decouple application code from the communications hardware, you can develop software with COTS hardware, even if the final target uses custom multiprocessor
49. gain with no application software changes (Figure 1). That feature widens the range of options for solving the simulation problem. The same flexibility that supports switching between single-processor and multiprocessor configurations without recoding also supports boosting the performance of properly designed applications by simply adding more processors, again without code changes.

Development under Windows. A corollary of independence from the underlying communications hardware is that, with software processes in a host environment like Windows NT, it's possible to simulate multiple processors during development, expanding your armory of techniques for developing working multiprocessor software in parallel with custom hardware development (simulation problem). The trade-off is the increased availability of software tools on the host system versus difficulties introduced by simulating the target system on a CPU with a different architecture (assembly functions can't be directly tested, for example).

Framework for multiprocessor development. As you can see, you must watch out for a number of pitfalls in multiprocessor systems before achieving application speedup. Most high-level multiprocessor development tools help by providing a ready-made working framework for multiprocessor application design. The abstract model of several such tools, including Diamond,
50. is based on C. A. R. Hoare's Communicating Sequential Processes (Prentice Hall, Englewood Cliffs, 1985). Software based on communicating sequential processes (CSP) has been widely employed in the industry for at least ten years. Using such a road-tested model in place of ad hoc twiddling of hardware control registers eliminates many design and implementation errors and helps with communications and debugging. The abstraction that produces the benefits is almost entirely conceptual: you have no implementation overhead above that of the traditional hardware-specific approach, other than selecting the required hardware-specific implementation of each communications function with a pointer rather than directly. The time taken by that extra memory load is orders of magnitude less than the time required for the communications itself.

AN IMPLEMENTATION

How is the framework concept presented to you, the application developer? We'll focus on frameworks based on writing C code, using 3L's Diamond as an example, but most of the underlying concepts also apply to other high-level multiprocessor development frameworks. Note that a graphical tool like Pegasus essentially acts as a "wizard" to generate
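Selecting the hardware-specific implementation "with a pointer rather than directly" can be pictured as a table of function pointers. The sketch below is a generic C illustration, not Diamond's internals: link_driver, app_send, and the two mock back ends are hypothetical names, and plain memory buffers stand in for the real shared memory and link hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver table: one entry per kind of communications hardware. */
typedef struct link_driver {
    const char *name;
    size_t (*send)(const void *data, size_t length);
} link_driver;

static unsigned char shared_mem[64];  /* stands in for a shared-memory bank   */
static unsigned char link_fifo[64];   /* stands in for a point-to-point link  */

/* Mock back end 1: copy the message into "shared memory". */
static size_t shared_mem_send(const void *data, size_t length)
{
    if (length > sizeof shared_mem) length = sizeof shared_mem;
    memcpy(shared_mem, data, length);
    return length;
}

/* Mock back end 2: copy the message into the "link FIFO". */
static size_t link_send(const void *data, size_t length)
{
    if (length > sizeof link_fifo) length = sizeof link_fifo;
    memcpy(link_fifo, data, length);
    return length;
}

static const link_driver shared_mem_driver = { "shared-memory",  shared_mem_send };
static const link_driver comport_driver    = { "point-to-point", link_send };

/* Application code sees only this call. The single pointer load in
   drv->send is the whole cost of the abstraction. */
static size_t app_send(const link_driver *drv, const void *data, size_t length)
{
    return drv->send(data, length);
}
```

Swapping shared_mem_driver for comport_driver changes the hardware path without touching the code that calls app_send, which is the property the article attributes to hardware-independent frameworks.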
51. is coding is a forward error correction scheme that reduces a modem's bit error rate for a given amount of channel noise by adding certain redundant information to the channel. The information reduces the chance that noise will create data errors, in effect increasing the distance between code points. The Viterbi decoder decodes the Trellis sequence and determines the most likely set of transmitted points. However, it's expensive in terms of MIPS: the C-coded Viterbi decoder alone took 140 percent of the cycles that the entire V.29 receiver took. In other words, the V.17 receiver was 2.4 times as expensive as the V.29.

[Figure 2 diagram: panel (a), sample shifting: the first four samples fill the delay line, all samples are shifted, and the new sample x5 is added to the top of the delay line; panel (b), circular addressing: after x4 is added, register A4 loops to the front of the delay line, and x5 is written over x1.]

Figure 2. A new sample (x5) is added to the delay line of a four-tap filter (x1 is the oldest sample in time) using the sample shifting method. Three shifts are needed before x5 is placed at the top of the delay line (a). Using circular addressing, register A4, set up to be used in circular mode, automatically wraps back to the beginning of the delay line after x4 is added and the end of the delay line is reached. When x5 is added, it overwrites the oldest available sample, x1, thereby eliminating the shifting altogether (b).

Fortunately
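The two delay-line methods in the figure can be compared in portable C. This is a sketch under assumptions, not the article's code: a wrapped array index stands in for the C6000's hardware circular addressing mode, and the names (circ_delay, shift_insert, circ_insert, fir_shift, fir_circ) are hypothetical.

```c
#include <assert.h>

#define NTAPS 4   /* four-tap filter, as in the figure */

typedef struct {
    short    line[NTAPS];  /* delay line storage */
    unsigned head;         /* index where the next sample will be written */
} circ_delay;

/* Shifting method: move every sample down one place (NTAPS - 1 moves),
   then store the new sample at the top of the delay line. */
static void shift_insert(short line[NTAPS], short sample)
{
    for (int i = NTAPS - 1; i > 0; i--)
        line[i] = line[i - 1];
    line[0] = sample;
}

/* Circular method: overwrite the oldest sample and advance the index,
   wrapping at the end of the delay line. No samples are moved. */
static void circ_insert(circ_delay *d, short sample)
{
    d->line[d->head] = sample;
    d->head = (d->head + 1) % NTAPS;
}

/* FIR sum over the shifted line: coefs[0] multiplies the newest sample. */
static long fir_shift(const short line[NTAPS], const short coefs[NTAPS])
{
    long sum = 0;
    for (int k = 0; k < NTAPS; k++)
        sum += (long)line[k] * coefs[k];
    return sum;
}

/* Same FIR sum over the circular line, walking backward from the newest sample. */
static long fir_circ(const circ_delay *d, const short coefs[NTAPS])
{
    long sum = 0;
    unsigned idx = d->head;                  /* one past the newest sample */
    for (int k = 0; k < NTAPS; k++) {
        idx = (idx + NTAPS - 1) % NTAPS;     /* step back with wraparound */
        sum += (long)d->line[idx] * coefs[k];
    }
    return sum;
}
```

Both versions produce the same filter output; the circular one trades the per-sample shifting for index arithmetic, which the C6000 performs in its address-generation hardware when a register is placed in circular mode.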
52. l instructions per cycle. Applying parallelism-generating transformations results not only in faster code, but also in a larger code size. In the embedded environment, though, programmers are usually limited by real-time and cost constraints, that is, by cycle counts and code size.

Because of these conflicting constraints, most compilers for embedded processors contain some mechanism for controlling the optimizations that affect size versus speed. However, simply managing this mechanism can be a daunting challenge. Given the simplified situation of a compiler switch with two states (best performance or best code size) and a very small application with 20 units of code (such as a file or function), 2^20, or roughly 1 million, combinations would exist.

A more realistic situation would involve a compiler switch with many states along a size-to-speed continuum and hundreds of code units. Obviously, programmers can't search the entire solution space for the optimum mapping of options to code units. Instead, they typically use the 80/20 rule, which states that 80 percent of an application's cycle requirements are in 20 percent of the code. The exact percentages can be debated, but the premise is usually true. Using this rule and knowledge of the application, programmers can quickly prune mu
53. lgorithms are in widespread use, mirroring the multiple modulation formats, coding strategies, carrier and symbol timing recovery mechanisms, and equalization methods deployed.

We'll look at one demodulation algorithm for frequency discrimination that has widespread application and is a good example of a DSP-optimized solution. Frequency discrimination has two primary uses: demodulation of the FSK family of waveforms, such as Gaussian minimum shift keying (GMSK) used in GSM cellular communications, and automatic frequency control loops. The DSP frequency discriminator algorithm uses differe
54. lication software (the DSP algorithms) from the communications hardware and contrasts with the point solutions used on many multiprocessor projects, where you try to tackle interprocessor communications by directly manipulating the underlying hardware in the application code. Although that approach works after a fashion, in most projects time inevitably limits the scope of the sophisticated code required to assist development. Support for tasks like debuggin
55. m Solutions, gjoerger@cmp.com

Gregory Montgomery, Director of Sales, gmontgom@cmp.com
Grace Adamo, Project Manager
Robert Steigleider, Ad Coordinator, rsteigle@cmp.com
Susan Harper, Circulation Director, sharper@cmp.com

Embedded Edge is published by Texas Instruments, Inc. and produced in cooperation with CMP Media Inc. Entire contents Copyright 2001. The publication of information regarding any other company's products or services does not constitute Texas Instruments' approval, warranty or endorsement thereof. To subscribe on-line, visit: www.edtn.com/customsolutions/edge/subscribe.fhtml

Code Composer Studio, TMS320, TMS320C6000, C6000, TMS320C5000, C5000, TMS320C2000, C2000, DSP/BIOS and eXpressDSP are trademarks of Texas Instruments, Inc. All other trademarks are the property of their respective owners.

Inside This Issue

Insighter: The Power of Cheese
The power of DSP capabilities can produce startling wonders.

Breakpoints
News from the providers of embedded systems development products and services.

Cover: Software Radio for Wireless Internet
With the latest DSP devices, you can design a software-configurable, low-cost modem that's very efficient and small.

Optimizing C Baseline Modems
In just a few short phases, you can optimize C-coded reference modems to meet higher density targets.

Speeding Development of Multi-DSP Apps
New software tools are taking aim at multiprocessor DSP syste
56. mplementary to CCS, extending it with mechanisms for simplified handling of multiprocessor applications that have significant interprocessor communications requirements.

As well as sending and receiving messages, the other primitive required by systems based on the CSP model is waiting until a message arrives over one of several alternative channels, where the source of the message is not known in advance (similar to the select operation in some Unix variants):

    alt_wait(n, &chan_1, &chan_2, ..., &chan_n);

The communications channels can be implemented with any hardware that allows for some form of communications and synchronization. For example, 3L previously implemented the same API on bit-serial transputer links, TMS320C4x comports, C6000s using PCI bus shared memory and interrupts for signaling, and C6000s using VMEbus shared memory and interrupts. Applications can run unchanged on all of them.

What do you do to take advantage of the leverage provided by that framework? With supported COTS multiprocessor hardware, nothing. The implementation of the framework's abstract communications primitives in terms of the supported hardware is built into the system. For other custom hardware, you create a link device driver for the framework.

DRIVERS FOR CUSTOM HARDWARE

To make use of a high-level framework for multiprocessor application development on custom hardware, you mus
57. ms, particularly for the TMS320C6000 DSP platform.

Profiler Boosts Code Optimization
A software-based statistical real-time profiler can help you dramatically improve the performance of TMS320C62x DSP code.

Wizards' Corner
Answers to developers' questions from experts in embedded systems development.

Launchings
New products and services for embedded systems developers.

On the Edge
Needed: New compilation tools to help optimize embedded code.

The Power of Cheese

We've all seen the commercial. A precocious waif leaves a plate of cheese for Santa instead of milk and cookies, and lo and behold, Santa leaves behind a roomful of luxury goods, ranging from top-line cars to high-tech electronics. The commercial concludes with the punch line rolling across the screen: "Behold the power of cheese."

Clever, the American Dairy Association. Too bad the semiconductor industry doesn't have the equivalent to similarly promote DSPs, for the power of DSP capabilities can produce equally startling wonders.

Take modems for wireless Internet devices, a technical challenge if there ever was one. But base wireless functions on digital signal processing and all of the advantages of the digital domain accrue, and new algorithms eager to exploit those advantages are appearing.

Basically, the design challenge is to achieve robust communications at high data rates (10 to 100 Mb/s woul
58. ng address manipulations.

When using the optimizer with circular addressing, you might have to experiment with a number of alternative codings to arrive at a solution that the optimizer respects. (The new Code Composer Studio 2.0 from TI supports circular addressing directly from C code.)

You should see significant improvement with circular addressing. Take our V.29 receiver (9,600 b/s) as an example. After the first three phases of our project, it consumed 222,188 cycles for each 10 ms of PCM data (80 samples). By modifying just the first two sections (the pulse-shaping and Hilbert filters) for circular addressing, we brought that down to 185,759 cycles. Changing the interpolating and baud timing recovery filters to the circular addressing mode reduced it to 155,677. Finally, changing the adaptive filtering and update routines shrank the cycle count down to 101,429, a reduction of roughly 54 percent from the 222,188-cycle starting point. (For a more in-depth discussion of circular addressing on the C6000, refer to the TI Application Report Circular Buffering on TMS320C6000, SPRA645.)

Since a V.17 receiver (14,400 b/s) is essentially the same code as the V.29 receiver but executes from different tables, these changes cause similar reductions to the V.17 receiver. However, we still need to optimize the Viterbi decoder. (Of the three common modems used to transfer fax image data, only the V.17 modem uses Viterbi decoding.) Trell
59. nstructed from channels. But with those frameworks, calling a simple function encapsulates all the hardware-specific work required on a particular board; for example, on the Sundance SMT3xx modules, setting up and driving the FPGA, fielding the interrupts it generates, and controlling a C6000 DMA channel (and its interrupts) to move the data between memory and the FPGA that handles interprocessor I/O.

A code-based framework presents its hardware-independent communications services as API functions

The key to most of the framework benefits is the hardware-independent API for interprocessor communications.

However, for a CSP-based concurrency framework to operate successfully with standard C tools like CCS, that syntax must give way to function calls. In Diamond's case, the API is basically two functions that operate on abstract, point-to-point channels, represented by the CHAN data type:

    void chan_in_message(int length, void *buffer, CHAN *channel);
    void chan_out_message(int length, void *buffer, CHAN *channel);

Those primitive operations handle both message-based communications and, implicitly, synchronization. After calling chan_out_message, the sending thread is blocked until the message is received. Similarly, a thread that calls chan_in_message is suspended until the incoming data arrives on the specified channel. Virtuoso generally

in the usual way, with a header file and
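To show how application code might use the two channel functions, here is a host-side mock that compiles with any C compiler. It is a sketch under loud assumptions: the CHAN structure below is a made-up one-message mailbox, the pointer types in the prototypes are inferred from the damaged listing, and the mock simply copies data where the real framework would block the calling thread until the transfer completes. gain_stage is a hypothetical worker task.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the framework's channel type: a one-message mailbox. */
typedef struct {
    unsigned char data[32];
    int length;               /* 0 means the mailbox is empty */
} CHAN;

/* Mock send: the real call blocks until the receiver has taken the message;
   this version only copies the message into the mailbox. */
void chan_out_message(int length, void *buffer, CHAN *channel)
{
    assert(length > 0 && length <= (int)sizeof channel->data);
    assert(channel->length == 0);            /* mailbox must be empty */
    memcpy(channel->data, buffer, (size_t)length);
    channel->length = length;
}

/* Mock receive: the real call suspends the thread until data arrives;
   this version requires that a message of the right size is already waiting. */
void chan_in_message(int length, void *buffer, CHAN *channel)
{
    assert(channel->length == length);
    memcpy(buffer, channel->data, (size_t)length);
    channel->length = 0;
}

/* A worker written purely against the channel API: read a block of
   samples, double them, and pass the block on. */
void gain_stage(CHAN *in, CHAN *out)
{
    short block[4];
    chan_in_message((int)sizeof block, block, in);
    for (int i = 0; i < 4; i++)
        block[i] = (short)(block[i] * 2);
    chan_out_message((int)sizeof block, block, out);
}
```

Because gain_stage touches only the two channel calls, the same source could in principle be linked against a shared-memory or link-based implementation without change, which is the portability argument the article makes.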
60. ntiation and cross-multiplication to generate an output that directly corresponds to the instantaneous frequency of the input signal samples. An optional envelope normalization block removes fading components present in the input signal (Figure 3).

Mathematically, the operation of the quadrature discriminator is as follows. For general complex input signals of the form

$$I(t) = r(t)\,\sin\theta(t), \qquad Q(t) = r(t)\,\cos\theta(t)$$

where $r(t)$ is the signal envelope and $\theta(t)$ is the angular phase (its derivative $\theta'(t)$ is the instantaneous angular frequency), the signals at the outputs of the two differentiators can be represented as

$$I'(t) = r(t)\,\theta'(t)\,\cos\theta(t) + r'(t)\,\sin\theta(t)$$

$$Q'(t) = -r(t)\,\theta'(t)\,\sin\theta(t) + r'(t)\,\cos\theta(t)$$

By cross-multiplying and subtracting the signals as shown in Figure 3, you obtain an output signal given by

$$I'(t)\,Q(t) - I(t)\,Q'(t) = r^2(t)\,\theta'(t)$$

Further division by the envelope term $r^2(t)$ yields a normalized real-time measure of the instantaneous frequency variations of the input signal. (In practice, it's much more efficient to use a lookup table to generate $1/r^2(t)$, which is multiplied with the top path signal.)

Unlike conventional frequency discriminators based on a phase-locked loop (PLL), the algorithm doesn't involve a feedback process. It also introduces little or no bandwidth expansion into the signal, thus ensuring that the Nyquist sample rate limit is not violated.
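A discrete-time version of the quadrature discriminator fits in one small C function. The sketch assumes the convention I = r sin(theta), Q = r cos(theta), replaces each differentiator with a backward difference between consecutive samples, and normalizes with a plain division where the article suggests a 1/r^2 lookup table. The names fdisc_state and fdisc_step are hypothetical.

```c
#include <assert.h>

/* Previous I/Q sample, needed for the backward difference. */
typedef struct {
    double i_prev;
    double q_prev;
} fdisc_state;

/* Process one I/Q sample pair and return the envelope-normalized
   discriminator output, an estimate of the phase change per sample.
   For a constant envelope the result equals sin(delta_theta). */
double fdisc_step(fdisc_state *s, double i, double q)
{
    double di = i - s->i_prev;          /* approximates I'(t) */
    double dq = q - s->q_prev;          /* approximates Q'(t) */
    double cross = di * q - i * dq;     /* I'Q - IQ' = r^2 * theta' */
    double env2 = i * i + q * q;        /* r^2, the envelope term */

    s->i_prev = i;
    s->q_prev = q;
    return env2 > 0.0 ? cross / env2 : 0.0;
}
```

There is no feedback path in the function: each output depends only on the current and previous samples, which matches the article's contrast with PLL-based discriminators.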
61. on. The key to obtaining an accurate profile is to run the application for a long time.

Although software development tool sets with profiling options already exist, almost none operate in real-time systems. Such profilers consume large amounts of DSP resources, and the local system can't synchronize with the host system because it can't manage the real-time interface under the added burden. In contrast, DSP overconsumption is insignificant in our system because it adds only 0.1 percent overhead to the main application.

Software-based profilers do have limitations, though. For one, if you program the timer to perform interrupts at short intervals and the time slot is too short, the profiler disrupts normal operation of the application. The longer the interrupt interval, the more time the system needs to get an accurate picture. We find that a 0.1-ms interrupt interval is sufficient for modem, voice-over-IP (VoIP), and fax-over-IP (FoIP) applications that have computing cycles

Listing 2. Example map file symbol list

    007e1b58  Control
    006abd3c  ExternalBufTable
    007d6ba4  ExternalChannelAddress
    006ec530  ExtractNum
    0070ac78  ExtractRTPHeader
    007daff4  ExtructChannel
    007d3 68  ExtructChannelFromLargeBlockList
    00743438  ExtructChannelFromList
    00743490  _ExtructChannelFromSmallBlockList
    0076db58  _FAX_ANS_Get_Size_Of_Obj
    0076daf4  FAX_ANS_get_state_valiables
    0076d93c  FAX_ANS_init
    0076aa00  FAX_ANS_mo
63. re, bring them together, and stand back: your application is liable to take off on you. OK, not quite, but new software tools (frameworks) for multiprocessor DSP systems promise much faster development of application software, plus a slew of other benefits.

Each one has its own claim to fame, but generally the tools are aimed at sorting out some common challenges with multiprocessors: partitioning of the problem and its data, processor communications and synchronization, simulation, and debugging. Fiona Culloch, from 3L, describes the two types of frameworks (ones based on writing C code and graphical development environments, with examples) and details how the former work.

While you're beefing up your DSP muscle, you could deploy a software-based statistical real-time profiler to dramatically boost the performance of your application even more. According to Konstantin Merkher and Jacob Bridger, of Surf Communication Solutions, optimizing code to boost performance is one of the greatest challenges in writing real-time DSP code. But if you write a real-time profiler, you can do it, to the tune of several orders of magnitude. Learn how inside.

More power to DSPs.

Stan Runyon, testman2@earthlink.net
64. sity G.726 Vocoder

Available for TMS320C54x and C6000 DSPs, G.726 vocoder software compresses 64-kb/s packet voice data for 40-, 32-, 24-, or 16-kb/s rates. It can implement 20 channels on a C5400 using 5 MIPS and up to 190 channels on a 300-MHz C6203. The vocoder is available in versions that comply with the TMS320 DSP Algorithm Standard or MSP Consortium M.100, as well as on the company's MSP Media Gateway DSP boards. Licensing fees are $20,800 for the object code and $26,000 for limited-use source code. Commetrex Corporation, Norcross, Ga., (770) 449-7775, www.commetrex.com

eXpressDSP-Based Library

Version 5 of SigLib, a highly portable ANSI C source DSP library, touts compatibility with the TMS320 DSP Algorithm Standard. It includes many of the low-level routines used in today's telecommunications algorithms and accommodates many of the fundamental telecom operations found in modems, mobile phones, and other network access devices. It comes with comprehensive examples and documentation and sells for $350. Numerix, Ltd., Leicestershire, U.K., +44 (0) 7050 803996, www.numerix-dsp.com

DSP Development Kit

The Developers Kit for Texas Instruments Digital Signal Processing combines MATLAB 6 and Simulink 4 with eXpressDSP Real-Time Software Technology to simulate, generate, and validate designs built around TMS320C6000 and C5000 DSPs. Features
65. ssing to recover the instantaneous (normalized) frequency of the input signal, with very low processing overhead.

PA LINEARIZATION

Pulse shaping and multisymbol modulation are wasted if the waveform is overly distorted after passing through the analog transmitting or receiving functions. Designers often sacrifice linearity because they're trying to maximize PA efficiency and power output, so some distortion often results. The conventional solution (backing off the PA drive to operate within a linear portion of the PA characteristic) wastes considerable power. It's much more efficient to use DSP techniques to predistort the waveform of the signal driving the PA, in a complementary manner to the PA nonlinearity.

The source waveform is passed through a lookup table, which stores the correction factor for the amplitude (and phase) of the waveform at any given PA drive level (Figure 4). For most applications, the PA characteristic isn't stable enough (for example, with temperature, supply voltage, and output load) for the process to be purely open loop. A feedback path from the PA output is therefore usually employed to allow the residual PA distortion to be measured and the lookup table coefficients to be updated.

For the high-bit-rate solutions required in WLANs, and the LUT-specific nature of the processing tasks involved, it's often best to implement adaptive predistortion using dedicated ASICs or FPGAs.
66. t of corporate marketing. He has 15 years' experience in R&D and management of DSP-based real-time embedded software and signal processing projects, spending the last several years in global high-tech business development.

DSP Helps Kodak With Upgradable Product

A TMS320DSC21 DSP from Texas Instruments, Inc. (Dallas, Texas, www.ti.com) sits at the heart of Kodak's mc3, a multifunction consumer imaging and audio product that captures video, still images, and audio. The chip lets Kodak customers upgrade the product, via software downloads from the Web, with the latest audio and video compression formats. The mc3 can record video at 20 frames/s for the highest resolution or 10 frames/s for virtually unlimited recording to removable memory cards. It also captures still color images having VGA resolution and can store up to 90 minutes of MP3 music on a 64-MB CompactFlash memory card.

TI Launches On-line DSP Newsletter

Texas Instruments, Inc. (Dallas, Texas, www.ti.com) has started an on-line monthly newsletter called e-Tech Innovations, Digital Signal Processing Edition. Readers can subscribe at www.ti.com/sc/docs/dsps/etechdsp.htm for an easy way to keep informed of the latest DSP news and trends from TI.

PCTEL and Groupe SAGEM Collaborate

PCTEL, Inc. (Milpitas, Calif., www.pctel.com) has formed a strategic alliance with Groupe SAGEM (Paris, www.sagem.com), Fr
67. t write a device driver that maps from the framework's abstract communications API to the available hardware (Figure 2). Three approaches are possible: porting kits, fee-based service, and IP licensing.

With the first approach, you write the driver based on a porting kit provided by the framework vendor that documents the interfaces between the driver and the rest of the system. The porting kit must contain sufficient software components so that you can link your custom drivers into the system.

With the second, you go to the framework vendor. In most cases, the vendor has a great deal of experience writing drivers that interface its software to the COTS boards that it supports and is willing to write custom drivers for a fee.

IP licensing is an intermediate approach to speed up driver development based on a porting kit. You license working source code for
68. tem from being overburdened with profile data, you should use a buffer for recording addresses and transfer them to the host in extended time cycles. Our addresses, for example, are transferred to the host every 10 ms.

When the C62x DSP target software is compiled and linked, several files are produced, one of which is a map file. The map file contains the entire target memory map, including symbol information, such as global variables and function names, and the addresses of the symbols relative to the physical memory map (Listing 2). The symbols are arranged in ascending order, so that the last address of a function is the one before the first address of the next function. Once a map file is produced, initial parsing can be performed.

After recording the vectors and receiving the buffer of addresses, the host begins the decoding process. It selects each address from the buffer and finds the specific function it belongs to (relative to the memory map). It then increases the access counter of that specific function, enabling the profiler to recalculate the relative DSP MIPS consumption value of the function. The value is calculated by dividing the number of hits for a specific function by the number of received addresses.

The host system periodically receives packets of recorded addresses through the DSP or host interface. Therefore it must be able to receive blocks of data according to
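The host-side decoding step described in this excerpt can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `profile_symbol` structure and the function names `find_symbol`, `decode_buffer`, and `relative_load` are hypothetical, and the map file is assumed to have been parsed already into a table sorted by ascending start address.

```c
#include <stddef.h>

/* One function entry taken from the parsed map file (hypothetical layout).
   The table must be sorted by ascending start address. */
typedef struct {
    const char   *name;   /* function name from the map file   */
    unsigned int  start;  /* first address of the function     */
    unsigned long hits;   /* number of samples that landed here */
} profile_symbol;

/* Binary search for the last symbol whose start address is <= addr.
   Returns its index, or -1 if addr lies below the first symbol. */
static int find_symbol(const profile_symbol *syms, size_t count,
                       unsigned int addr)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].start <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (int)lo - 1;
}

/* Attribute every recorded address to a function and bump its counter.
   Returns how many addresses could be attributed. */
static size_t decode_buffer(profile_symbol *syms, size_t count,
                            const unsigned int *addrs, size_t n)
{
    size_t matched = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        int idx = find_symbol(syms, count, addrs[i]);
        if (idx >= 0) {
            syms[idx].hits++;
            matched++;
        }
    }
    return matched;
}

/* Relative load of one function: its hits divided by all received addresses. */
static double relative_load(const profile_symbol *sym, unsigned long total)
{
    return total ? (double)sym->hits / (double)total : 0.0;
}
```

Because each symbol is stored only with its start address, any sample above the last symbol is attributed to the last function. A real tool would probably also record the end of the code section and discard samples outside it.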
69. the 80/20 rule. Translated into software terms, it stipulates that 80 percent of the DSP resources are used by less than 20 percent of the code. For our highly complex signal processing applications, the rule is closer to 95/5. That phenomenon is extremely encouraging. It means that isolating and optimizing the voracious 5 percent of code reduces DSP MIPS usage and boosts application performance.

You can develop finely tuned, tight DSP code by identifying the few sections that overextend the MIPS budget. One tool commonly used for that purpose is a hardware profiler that is part of or added to an in-circuit emulator. A hardware solution, however, has two drawbacks: expense and obtrusion.

By Konstantin Merkher and Jacob Bridger

Listing 1. Example ISR code to service timer interrupt for C62x

    extern cregister volatile unsigned int IRP;    /* Interrupt Return Pointer */
    volatile unsigned short int profiler_enable_flag = 0;
    unsigned int profiler_program_counters[MAX_DATA_BUFFER_SIZE];    /* Record Buffer */
    unsigned short int profiler_program_counters_index = 0;    /* Record Index */

    void interrupt intXX()
    {
        if (profiler_enable_flag)
        {
            /* Record the interrupted address while the buffer has room */
            if (profiler_program_counters_index < MAX_DATA_BUFFER_SIZE)
            {
                profiler_program_counters[profiler_program_counters_index++] = IRP;
            }
        }
    }

Buying the tool for one development station may not be a major expense for a typical company, but purchasing it for
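Listing 1 relies on TI-specific keywords (`cregister`, `interrupt`) and so builds only with the C6000 compiler. As a rough, portable sketch of the same record-then-drain idea, the C code below models the sampling buffer on any host. The names `record_sample`, `drain_samples`, and `SAMPLE_BUFFER_SIZE` are hypothetical and do not come from the article, and the way the buffer is handed to the host is an assumption.

```c
#include <stddef.h>

#define SAMPLE_BUFFER_SIZE 8   /* hypothetical size, kept small for illustration */

static unsigned int    sample_buffer[SAMPLE_BUFFER_SIZE];
static volatile size_t sample_count = 0;
static volatile int    sampling_enabled = 0;

/* Stand-in for the timer ISR body: store the interrupted address
   if profiling is enabled and the buffer still has room. */
static void record_sample(unsigned int return_address)
{
    if (sampling_enabled && sample_count < SAMPLE_BUFFER_SIZE)
        sample_buffer[sample_count++] = return_address;
}

/* Stand-in for the periodic transfer to the host: copy out the
   recorded addresses, then reset the buffer for the next period.
   Returns the number of addresses copied. */
static size_t drain_samples(unsigned int *dest, size_t capacity)
{
    int was_enabled = sampling_enabled;
    size_t n, i;

    sampling_enabled = 0;              /* pause recording while copying */
    n = sample_count < capacity ? sample_count : capacity;
    for (i = 0; i < n; i++)
        dest[i] = sample_buffer[i];
    sample_count = 0;                  /* samples beyond capacity are dropped */
    sampling_enabled = was_enabled;    /* restore the previous state */
    return n;
}
```

In this sketch, samples that arrive while the buffer is full are simply lost, which matches the bounds check in Listing 1. A period short enough to drain the buffer before it fills keeps the statistics representative.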
70. the features of a standard PBX-based phone, including call transfer and caller ID. Accompanying software includes G.723 and G.729 voice compression and automatic echo cancellation algorithms, as well as IP, UDP, RTP, and DHCP network protocols, SIP signal protocol, and DES encryption. Prices start at 300 each. ADtech, Boncelles, Belgium, +32 4 338 13 30, www.adtech.com.

Low Data Rate AMBE+2 Vocoder

A low-data-rate AMBE+2 vocoder operates at 2.0 to 9.6 kb/s for applications where low bandwidth and high-quality speech performance are high priorities. Its model-based multiband excitation algorithm carries distinct advantages over conventional CELP-based vocoders, including higher mean opinion scores. The software runs on the TMS320C5000 DSP platform. The price depends on customer specifications. Digital Voice Systems, Inc., Burlington
71. three phases focus on utilizing the optimization abilities of the TMS320C6000 compiler to achieve high code performance while maintaining the code in C. The last phase involves linear assembly coding of those portions of the code whose performance needs to be improved further. (This figure is based on the one on p. 1-4 of the TMS320C6000 Programmer's Guide.)

200-MHz C6201. Always maintain the C-coded baseline as your reference code. Because it's often developed very straightforwardly, leaving it as a reference will be valuable if you have to diagnose a problem. Make your improvements there, then factor them into the optimized version to produce a bit-exact version.

Phase 2 involves compiling using the optimization options. The optimizer, combined with a judicious memory layout, can more than double the number of modems on one chip. Allocate a few weeks for the effort. But note that the optimizer is capable of "breaking" the modems in a few places, so you may have to modify some pieces of the C code. Also, you may find that some of the changes you make to the C code to improve the optimizer's results when using CCS 1.1 aren't required when using the release 1.2 optimizer; therefore, move to 1.2 if you can.

In phase 3, you tune the C code. There are a number of techniques to refine your C code and greatly increase its efficiency. The goal is to allow the compiler to schedule as many instructions as
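The excerpt does not name the phase 3 techniques, so the following is only one plausible example of C-level tuning, offered as an assumption rather than as the article's method. Qualifying pointers with `restrict` tells the compiler that the arrays do not overlap, which gives its scheduler more freedom to reorder loads and stores across loop iterations. The function name `vector_add` is hypothetical.

```c
#include <stddef.h>

/* Element-wise sum of two 16-bit vectors.
   The restrict qualifiers promise that out, x, and y never overlap,
   so a store to out[i] cannot change a later x[] or y[] load and the
   compiler may overlap work from several iterations. */
static void vector_add(short *restrict out,
                       const short *restrict x,
                       const short *restrict y,
                       size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (short)(x[i] + y[i]);
}
```

The promise is the caller's responsibility: calling `vector_add` with overlapping buffers is undefined behavior, so the unqualified baseline version remains useful as the bit-exact reference.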
72. which is embedded in Microsoft's Windows Media Player 7, improves the dynamics and bass performance of stereo audio played through small speakers and headphones. The

synthesis equipment, such as conventional and cellular phones, VoIP devices, headsets, and digital radios. VIP's small footprint fits easily on top of existing voice coder or processor applications.

Linux kit for TI DSPs debuts

RidgeRun, Inc. (Boise, Idaho, www.ridgerun.com) has unveiled the DSPLinux Software Development Kit (SDK) for the OMAP1510 and TMS320DSC21 processors from Texas Instruments, which combine an ARM7 or ARM9 core with the TMS320C5000 DSP platform. Now available in beta, the kit includes a 2.4 version of the Linux operating system optimized for embedded devices, standard GNU development tools, and the Desktop Simulation Environment.

Mentor, TI Team for Coverification

Mentor Graphics Corp. (Wilsonville, Ore., www.mentor.com) and Texas Instruments, Inc. (Dallas, Texas, www.ti.com) have agreed to deliver coverification Processor Support Packages (PSPs) for TI DSPs and microcontrollers. The PSPs are based on TI's present instruction set simulators and connect to Mentor Graphics' Seamless Co-Verification Environment through an adapter layer. PSPs included in the agreement model several TMS320 DSPs, among them C2000 devices, as well as the ARM925 microcontroller core. They work with all
73. ystems are in play. This situation gives you the flexibility to work with the latest modulation, coding, access, and equalization formats, thus squeezing every last drop of capacity out of the channel. However, the manufacturer has to build a degree of flexibility into its design, or develop multiple versions, so that the product can operate with multiple air interface standards across a range of service providers and network types.

Achieving flexibility in a wireless platform requires three key features: minimal frequency-selective components, high linearity to minimize self-induced signal distortion, and maximum software-defined functionality.

The core building blocks of a modern software-defined digital radio are a linear power amplifier (PA), a wideband, low-noise front end, a good synthesizer, and, of course, a high-speed DSP engine and data converters (Figure 1). For maximum software configurability, the analog signals should ideally be converted into digital form at the RF carrier frequency. Unfortunately, the conversion isn't viable at frequencies much above 500 MHz; with most WLAN spectrum allocations in the 5-GHz-plus range, an analog intermediate frequency stage is still required. The digitization frequency used is governed by component costs; the linearity, speed, and dynamic range of D/A and A/D converter technology; and power consumption constraints.

Both the linearity and the speed of converters hav
    