Home
        On Real-Time Systems and Processor Architecture
         Contents
1.   ER      3 11 THOR HSO configuration    In the proposed configuration  THOR  25 MHz  does not require any wait state so   W   0  U   1 95 leading to Y U  W    0 51 and     Zi  1  Z2 1  Z3 2  Z4    finally   1 1    ER   14 3 MmizedI PS      175 40 ns    3 12 SPARC HSO configuration    The SPARC configuration utilises a 64 kByte cache memory  Experience has shown that  for a cache of this size  a hit rate of 90   is probable  Denoting a 32 bit word fetched    from the cache Z  C   we write        ERE    4 21   Z2 2   Z3  3   Z424  0 10      A1 C ay   Z2 C ax2   Za C z3   Za C za  0 9    Timing analysis shows that a cache miss will cost one wait state  An access whithin    cache may be done without wait states  Hence     21    and     The HSO configuration runs at 40 MHz and from this     1 1    ER                R 1 735 25 ns      23 MmizedIPS    3 13 Summary of results    As shown in table 3  the HSO designs clearly favour SPARC  This is not very suprising  because the SPARC CPU is available in a 40 MHz version and offers an architecture  designed for single cycle execution of instructions  The figures of power requirement and  the required board area indicates the price for this superior performance     Table 2  however  gives another picture  The restrictions imposed on the real time  system configuration degrades total SPARC system performance notably  Here it is  comparable with both THOR and T800  The explanation lies in the absence of cache  memory  and the presence of 
2.  Event handling    By    normal flow of instruction execution    we generally mean the execution of sequential  instructions in memory  JUMP  BRANCH and CALL instructions  in short an easily  predetermined behaviour from the computer system  A break in normal flow of instruction  execution is an event of some kind  such as     e An interrupt  normally caused by an external device pulling a dedicated pin on the  processor active     e An exception  caused by the execution of an instruction preventing finishing execu   tion of the instruction  Examples are  Arithmetic faults  divide by zero  attempt to  draw the root from a negative number etc   violation of permissions such as attempt  to access supervisor memory in user mode  attempt to execute privileged instructions  etc     e A trap  caused by a special instruction and providing method of implementing op   erating system calls etc  A trap may be conditional such as TRAP on OVERFLOW  and used in conjunction with arithmetic operations     In real time systems an external event should affect the internal state of the system  and or get some kind of attention  Hardware support for event handling is provided by the  processor   s interrupt mechanism  All of the studied processors treat interrupts in a similar  manner  The elapsed time between an interrupt and the point at which processing starts  at the appropriate interrupt handler address can be regarded as the interrupt latency time  and is divided into three phases     1  F
3.  and  correction  EDAC  unit  for check of processor communication with memory  on chip     11    2 5 Summary    The large register file present in several of the studied processors allows optimizing com   pilers to arrange for fast subprogram calls by passing parameters in registers  When a  large register file is available there is a good chance that all  or most of  the parameters  could be passed this way  The MC88100 and R2000 are good examples  Both archi   tectures provide large register sets and the usage of these registers could be optimized by  a compiler  The drawback here comes in the case of nested subprogram calls  only the  highest program level can take full advantage of this construction  With a register window  design  as in SPARC or Iapx80960  it is possible to increase the number of program  levels that will benefit from parameters passed in registers  However  the fundamental  problem remains since even very large register files may be exhausted  A stack architec   ture such as T800 or THOR provides a natural convention  stacking of all parameters   This is simple and straightforward and causes no penalty on nested calls  Furthermore   with THOR  since the 32 bytes close to top of stack are present in on chip registers it is  possible to take advantage of the rapidness with register passing without having to bother  with save and restore in the case of nested calls  Am29000  finally  provides a solution  similar to SPARC  The large number of registers and
4.  on chip timer facilities  Am29000  T800 and THOR     Real time systems are used to maintain surveillance and control processes where a  system failure might have disastrous consequenses  Nuclear plants  aircrafts  spacecrafts  just to mention a few  In the years to come we will see even more applications with    12    steadily growing demands for reliability and security  Consequently hardware software  debugging support and fault tolerance are also important parts of real time system design   All of the processors provide some kind of software debug support  Furthermore T800  provides facilities that makes real time debugging possible to a limited extent  Built   in fault tolerance support such as selfcheck  memory error detection  and correction  is  provided only by THOR while MC88100 and Am29000 provides support for redundant  designs     3 Real time system hardware designs    A physical real time system  when used in aerospace for example  must meet some im   portant needs  It should be small in size  have low weight and low power consumption   The system should be reliable and thus only high quality components  at least military  qualified  should be used  Fault tolerance support is desirable and memory errors must  be detected and preferably corrected   See  Tor90  for a thourougly description of re   quirements on microcomputers in critical applications   The purpose with this chapter  is to highlight how demands on system hardware impacts on system performance and  
5.  the use of a run time stack made  up by registers could be thought of as register windows where the calling and the called  program share a set of registers     In hard real time systems fast rescheduling is of great importance  Process switches in  real time systems can be a time consuming matter  Moreover  since processes are created  and removed dynamically it becomes very difficult to predict the time spent on these  activities  In analyzing the processor   s ability to perform fast task switches the important  observations are     e The register file should be reasonably sized since a task switch  process switch  re   quires the entire processor context to be exchanged     e Hardware support for task switches is an essential feature to reduce the time spent  for rescheduling     A large register file will delay processor context switch significantly  Therefore  a  large register file  which has proved essential for increase of system performance could  become a bottleneck with unpredictable consequenses  From above we conclude that a  stack architecture  such as T800 or THOR  with hardware support for process switches  provides considerably better performance than any of the other processors     In applications where speed is far beyond human control and the tolerances are small  there are often needs for precise time handling  i e processes that require a precise delay  should get that delay and nothing else  Three of the studied processors addressed these  issues with
6.  uses eight general purpose registers  software convention  for param   eter passing  The responsibility for saving these registers contents during nested subpro   gram is laid upon the compiler     The Iapx80960 provides sets of 16 local register for each subprogram  There are 4  sets of these registers on chip  If a nesting depth larger than 4 is used  the processor  automatically saves the local register contents on stack  thus freeing local registers for  use by the subprogram  Parameters are passed using the global registers which are ac   cessible regardless of which local register set is currently active  thus 15 parameters could  conveniently be passed to  or from  a subprogram and nested calls requires stacking of  parameters     The Am29000 utilises a large  192   on chip register set which is organized as a  run time stack  When a subprogram is called  a new activation record  or    stack frame     is allocated  This record includes local variables  arguments to the subprogram and a  return address  A compiler targeted to the Am29000 should use two run time stacks for  activation records  one for often used scalar data and another for structured data and  additional scalar data  The scalar portion of the activation record can then be mapped  into the processor   s local registers  because of the stack pointer addressing which applies  to the local registers  Since activation records are allocated and de allocated within the  local registers  most procedure linkage 
7. 2      1 1 Event triggered systems    In an event triggered system the software consists of a real time kernel and the application  programs  The kernel is responsible for process synchronisation and communication as well  a scheduling of processes  application programs  in the system  Furthermore  the kernel  often handles input output from to peripheral devices by means of hardware interrupt  facilities  This provides for rapid respons to external stimuli  events  by the use of special  interrupt handling  By the use of an appropriate scheduling algoritm the kernel dispatches  the CPU to the process that most urgently needs to execute     1 2 Time triggered systems    Similar to an event triggered system  a time triggered system should respond to external  stimuli  In a time triggered system however  the event is not sampled momentary with the    real time event  Rather  the tame triggered system checks for real time events at regular  predetermined intervals  During each interval an input device that reflects the event  is  read  Note the distinction between a real time event and its projection  i e the time it  becomes known to the system  Obviously these intervals must be constructed to guarantee  that all hard real time requirement should be met  Consequently the event signal has been  moved from a hardware interrupt mechanism to a software polling mechanism  By remov   ing hardware interrupts and software interrupt handling  time triggered systems provide  us with a fu
8. 26114   26020   36190 Total Power Requirement  mW   119576   104767   169453   Failure Intensity  FITS        Table 3  Summary  general purpose system configuration    3 1 General notes on the designs    For each design a memory read cycle was analyzed and results were used in the performance  evaluation     Estimations were performed using worst case assumptions  The designs were opti   mised for the highest possible clockfrequency i e no attempt was made to reduce wait state  penalties due to high clock frequence     For both configurations the following instruction mix was chosen     e 50  arithmetical logical instructions  e 25  jump branch instructions  e 10  load store instructions    e 15  floating point instructions    3 2 Execution rate estimation    The instruction mix was made up from     14    e z     percentage arithmetical logical instructions     z    percentage jump branch instructions     x3   percentage load store instructions    e x4   percentage floating point instructions  Parameters that describes the processor in effect were     e X   the number of processor cycles required to execute an arithmetical lo gical in   struction    e X    composed by   0 1X21   0 9X22 where        X is the number of processor cycles required for a    branch not taken    in   struction        Xo is the number of processor cycles required for a     branch taken    instruction  Hence  it was assumed that 90  of all conditional branches are taken     e X3  denotes the number of 
9. OR has an on chip timer as well as a built in EDAC  The chip was not available  at the time for this investigation and actual figures concerning the THOR chip were  obtained from simulations in a Genesil Silicon Compiler  According to these simulations   the clock frequency would be 15 MHz  assuming components satisfying military range  requirements  It was found that one wait state must be inserted during each read memory  cycle and the following parameters were chosen to describe the THOR configuration     X  1  X2  1  X3   2  X4 4    95  of THOR instructions are encoded in 16 bits  the rest are encoded in 32 bits   hence U   1 95 and with W   1 from above     Y W U    1 03  Thus   Z     Y W U    1 03  Z2   Y W U    1 03  Z3  3  Z4   X4 4  leading to   1 1  ER               8 9 MmizedI PS  1 673 67 ns we  For the memory activity  AMA   0 410    which gives  326 mW   device     3 8 SPARC HDO configuration    The CY7C601 chip available in military specification range is running at 25 MHz  This  configuration requires that two wait states are inserted during a memory read cycle  The  following parameters were chosen to describe the SPARC configuration     X   1    19    X2 1  X3 3  X4 4    A SPARC instruction is encoded in 32 bits so U   1  From above W   2  and     Y W U  3  thus   Z   Y W U  3  Z2  Y W U  3  Z3  5  Z4   X4 4  leading to  1 1    ER   7 5 MmizvedI PS      3 35 40 ns    The memory power down facility may not be used since it is not possible to deassert  memory 
10. On  Real Time Systems  and  Processor Architecture    Roger Johansson  Department of Computer Engineering  Chalmers University of Technology  5 412 96 Goteborg  Sweden    E mail  roger ce chalmers se    June 22  1993    Abstract    This report discusses the impact of hard real time systems requirements on mi   croprocessor performance  Certain dependability aspects are alse considered although  not covered in detail  Therefore we discuss hard real time systems and micropro   cessors from an architectural point of view as well as system hardware design  The  architectural considerations assume an event triggered hard real time system with ker   nel software  The hardware considerations treat a space qualified computer system  compared to a general purpose application    Hard real time systems are intended for use in environments where dependability is  a primary design goal  For the majority of common microprocessors  high performance  has been the primary design goal  However  a primary design goal such as high per   formance introduces conflicts with a design goal such as dependability  It is also clear  that a hard real time system implementation that utilizes a high performance RISC  CPU does not necessarily benefit from the high execution rate that the microprocessor  offers    Keywords  Hard real time systems  dependability  microprocessor architecture    1 Introduction    An important field of computer exploitation is real time systems  A real time system can  be unders
11. an EDAC which prevents the system from gaining from the  benefits that the SPARC architecture offers  At the same time the expected failure rate  and the total board area required are considerably larger than for THOR  The power  requirement more than doubled compared to both T800 and THOR     3 14 Conclusions    The system hardware considerations indicate that in a real time system design there is  not very much to gain with a modern  general purpose RISC design  On the contrary   while the estimated performance for SPARC was just about the level of THOR  the  board area became approximately 40  larger  the power consumption 70  higher and the  expected failure rate became 45   higher     22    4 Concluding remarks and Future work    Hard real time systems are intended for use in environments where dependability is a  primary design goal  Examples of such environments are  spacecraft  aircraft  nuclear  plants and various military applications  It is clear that the probability of a computer  failure causing an accident must be kept as low as possible since any accident in these  contexts very well may cause severe human injuries     For the majority of common microprocessors  high performance has been the primary  design goal  Most certainly  high performance is desirable even when it comes to hard  real time systems  However  a primary design goal such as high performance introduces  conflicts with a design goal such as dependability  For example  pipelined architectures  a
12. assumptions were made    e Quality Factor   S  0 25    e Voltage Factor   1    e Application Environment Factor   Space Flight  0 9     17    The T800 and SPARC designs both utilise an    error detection and correction unit      EDAC   The introduced delay  36 ns  worst case for the EDAC in use  is inserted by the  EDAC control and assures that memory     Ready    signal will not be asserted until correct  data is guaranteed  THOR has a built in EDAC so there was no need for this unit in the  THOR HDO configuration     3 6 T800 HDO configuration    T800 chip running at 17 5 MHz is available in mil spec  Since the T800 has an on chip  timer  no such peripheral device is required  From the read memory cycle analysis it was  found that three wait states has to be inserted  The following parameters were chosen to  describe the T800 configuration     X  2  Xo    2  Xy  4  X2   3 8  X3 2  X  8    The manufacturer claims that about 70  of executed instructions are encoded in a  single byte  Inm89  p 195  From the current instruction mix we assume that 50  of the  instructions are encoded in 8 bits  30  of the instructions are encoded in 16 bits  the rest  are encoded in 32 bits  This gives U   2 and with W   3 from above we have     Y W U   2  Thus    7    X1   2   Zy   X2   3 8   Z3  5   Z4   X4  8   leading to   1 1    ER   4 8 MmixredI PS      3 65 57 ns  For the memory activity we obtain     AMA   0 18    which gives  189 mW  device     18    3 7 THOR HDO configuration    The TH
13. ble     Control transfers to appropriate exception handler     The T800 and THOR treatment of hardware interrupt as a synchronization primitive  may be used to implement very fast process switches  This subject will be treated in the  next paragraph     2 3 Process switch    In a real time environment each program under execution constitutes a process  Another  name for a process is a task  both terms will used here  For each process there must exist     e A Process Control Block  PCB  used by the operating system to maintain the pro   cess  Entries in the PCB may also be used by the process itself     e Data Space  where the process data resides     e Code Space  where the process code resides  May in some cases be shared by several  processes     In addition to this we must add the procesor context to fully describe a process at any  time  A processor   s context is characterised by     e Accessible register contents  e Internal  unaccessible  register contents    e Processor internal state    During a context switch  at least the processor internal state and the internal register  contents must be preserved  or the processor must be allowed to proceed until a well defined  state is reached  For example  the current instruction is allowed to complete  Furthermore   to allow restart of the interrupted program  the status register  stack and program counter  must be saved  For a process switch  obviously the entire processor context must be saved  which also includes the acce
14. can occur without external references  Also  during  procedure execution  most data accesses occur without external references  because the  scalar data in an activation record is most frequently referenced  Activation records are  typically small  so the 128 locations in the local register file can hold many activation  records from the run time stack     R2000 uses four general purpose registers  software convention  for parameter passing   The responsibility for saving these registers contents during nested subprogram calls is laid  upon the compiler     Cypress SPARC utilises a set of 136 registers where 32 general purpose registers   divided into 4 groups  are visible to the program  The     outs     8 registers  in the active  window are are identical to the ins of the next window  The out register r 15  is used for  saving current address by the CALL instruction  Thus seven parameters may be passed   using registers  during a subprogram call  By software convention  fewer parameters can  be assumed thus providing additional local registers  If a nesting depth exceeds 4  a trap  occurs and the real time kernel must take approriate actions     Both T800 and THOR are stack architectures  Consequently parameters are passed  via the stack  Furthermore  in THOR  32 words from Top of Stack and downwords are  reflected in registers on chip  A writeback mechanism provide for consistency with memory  contents  The writeback is simultaneous with other processor activities     2 2
15. chip select during interlocks and so the total memory power requirement is 650  mW  device    3 9 The HSO configurations    The HSO configuration is intendeded to estimate peak performance for a general purpose  computer  It consists of a microprocessor with 1 MByte of static random access memory   The HSO configuration is accomplished by eliminating the EDAC circuitry and changing  the memory devices from the HDO configuration  Glue logic  except from address decoding  and bus buffers is implemented using macro cells  The memory is built from eight 64k 16  bit  25 ns static rams  Since the used memory does not facilitate a    stand by    power mode   the memory power requirement is fixed  Address decoding is performed by high speed  PAL devices  eliminating any address bus skew which otherwise may arise in high clock  frequency systems  Failure Rate Estimations assumes commercial quality components and  a    Ground  benign    environment     3 10 T800 HSO configuration    From the T800 read cycle analysis  and with the chosen configuration  we conclude that an  external memory read cycle may be performed without wait state penalty  This also implies  that there is nothing to gain from a cache memory  It should  however  be emphasised  that the T800 internal memory  4 kByte  is not considered     20    Hence W   2  U   2 leading to Y W U    1 5 and     Z  2  Z2   3 8  Z3  A  Z4 8    The HSO T800 configuration runs at 30 MHz and thus     1 t   8 5 MmizedI PS    3 55 33 ns     
16. cuting one machine  instruction at a time and then returning control to some debugging tool  In an event  driven real time system  a more extensive support would be desirable to catch transient  erronous behaviour resulting from special occurances of events  The environments in which  real time systems mostly reside and the tasks that they most often perform makes con   tiguous service or service during operation difficult or impossible to carry out  This makes  hardware debugging facilities and fault tolerant aspects central in real time system design   The following summarize the processor   s support for timer facilities  software hardware  debugging and fault tolerance     MC88100 can be forced to a    serial mode     disabling the pipe line  by setting one  bit in the status register  This  significantly reduces machine throughput but is useful  for debug purposes  Besides from that  software debugging must be accomplished by the  use of general trap handling facilities  The processor include   s comparator circuits at the  output to support fault detection  There are several possible configurations possible for  master checker operation and other redundant designs     10    To support debugging systems  the Iapx80960 provides a mechanism for monitoring  processor activity by means of trace events  The processor can be configured to detect  seven different trace events  including the instruction execution  branch events  calls  su   pervisor calls  returns  prereturns an
17. d breakpoints  When the processor detects a trace  event  it signals a trace fault and calls a fault handler     In Am29000 software debug is supported by the trace facility which guarantees exactly  one trap after the execution of any instruction in a program being tested  This allows a  debug routine to follow the execution of instructions  and to determine the state of the  processor and system at the end of each instruction  The processor has a built in timer  facility which can be configured to cause periodic interrupts  The timer facility consists  of 2 special purpose registers   the tamer counter and the timer reload registers  which  are accessible only to supervisor mode programs  The timer facility may be used to  perform precise timing of system events  Each Am29000 output has associated logic  which compares the signal on the output with the signal which the processor is providing  internally to the output driver  The processor signals situations where the output of any  enabled driver does not agree with its input  For a single processor  the output comparision  detects short circuits in output signals  but does not detect open circuits  It is possible to  connect a second processor in parallel with the first  where the second processor has its  outputs disabled due to the Test mode  The second processor detects open circuit signals   as well as providing a check of the output of the first processor     The R2000 instruction set includes a BREAK instruction whic
18. dependability     This chapter discusses six computer designs that use the Inmos T800 Transputer  the  Saab Ericsson Space THOR and the Cypress SPARC microprocessors respectively in  order to evaluate hardware aspects of the three processors in two different configurations     e A Real time System application  called the High Dependability Oriented configura   tion   HDO   The HDO configuration should be thought of as an on board computer  for a spacecraft     e A general purpose  embedded  system application called the High Speed Oriented  configuration   HSO      The designs  which not are realised  are considered comparable at cost and analyzed  to give an estimation of     e maximum possible instruction execution rate  e required number of devices   e area of printed circuit board   e power consumtion    e failure rate    13    The results are presented in Table 2 and Table 3 and the rest of this chapter briefly  describes the method that was used in obtaing these figures  For a thorougly discussion  on this subject see  Joh92      mopman o a SA    Clock Frequency  MHz   Mixed instruction execution rate  MmixedIPS     Number of required devices  10307   7844 Total area for devices  mm2    Total power requirement  mW    Failure Intensity  FITS        Table 2  Summary  real time system configuration    EOE    Clock Frequency  MHz   s5 14 3 33 0 Mixed instruction execution rate  MmixedIPS   21 19 23 Number of Required Devices    7730 8289 12785 Total area for devices  mm2   
19. e kept very low without loss of a very high degree of dependability   Meanwhile  a tremendous number of installed units will insure high volume production  and thus motivate increased costs during design  implementation  integration and test  phases  In particular  the system design and implementation should be paid special at   tention and a careful  dedicated design should comprise a fault tolerant hardware solution  as well as an application program development environment  From this example  we may  identify a highly interesting field for future research and development  A fault tolerant mi   croprocessor architecture dedicated for use in a safety critical time triggered hard real time  system     23    References     Adv88    Bri93      But93      Inm89    Int 83      Joh92      MIP87    Mor93     Mot90     Rom     Saa92        Tor90      Tor92      You82     Advanced Micro Devices  Am29000 streamlined instruction processor  1988     Bridal et  alt    Dacapo  A dependable distributed computer architecture for con   trol of applications with periodic operation  Technical Report 163  Laboratory for  Dependable Computing  Chalmers University of Technology    412 96 G  teborg   1993     Buttazzo G C  Di Natale M  Hartic  A real time kernel for robot control  Techni   cal report  ARTS Lab  Scuola Superiore S Anna  Via Carducci  40   56100 Pisa   Italy  1993     Inmos limited  Transputer databook  second edition  1989   Intel Corporation  80960KB programmer   s reference manua
20. h causes a BREAK trap  to occur  Control is transferred to the applicable system routine     In SPARC  software debugging is only supported by the means of general trap in   structions     T800 supports software debugging by a variety of instructions that affects status bits   When the processor Analyze  pin is taken high the processor will halt at a descheduling  point  Consequently the processor offers possibility to respond differently on interrupts de   pending on the processor   s current mode  T800 incorporate a timer  The implementation  directly supports the    occam    model of time  Each process can have its own indepen   dent timer which can be used for internal management or real time scheduling  Hardware  redundancy is acheived by the means of multiple transputer configurations     THOR has a built in real time clock to keep track of system time  Furthermore   each process has a Delay Register  causing interrupt after a specified delay  This provides  for an efficient implementation of a high level language  real time  delay function since  kernel software is released from polling a  delay queue    each time a scheduling is to be  performed  Also the TASK instructions implemented in THOR serves as support for  introducing the ADA task concept as constituting a process in a real time system  There  are instructions for scheduling and delaying tasks as well as performing    rendezvous     between tasks  THOR provides hardware selfcheck as well as an error detection
21. inish current instruction  does not apply to exception      2  Check interrupt priority level versus current processor level  i e whether the interrupt  should be serviced or not     3  Save enough processor status to be able to continue processing after the interrupt  has been serviced     Finishing the current instruction causes no significant delay provided that no possible  instruction  from the instruction set  may last for more than one  or a few cycles  This  is true for the studied processors  Processor activities are assigned priorities determined  by the type of activity  For example  reset handling has the highest priority and thus  cannot be interrupted  Interrupts are assigned priorities to predetermine the behaviour  when simultaneous events occur and to assure that no high priority processor activity may  be interrupted  The saved processor status required to restart an interrupted program is  determined by the activities required to service the interrupt  In general  the processor  does not save general register contents when servicing an interrupt  The interrupt handler  routine is responsible for saving and restoring register contents which might be altered by  the service routine     Beyond the described    general approach    to hardware interrupt handling both T800  and THOR provides extended use of the interrupt mechanism by a single process  The  T800 EventReg and EventAck pins provide an asynchronous handshake interface between  an external event and a
22. l  1988     Johansson Roger  Processor performance in real time systems  Technical Report  136L  Department of Computer Engineering  Chalmers University of Technology   5 412 96 Goteborg  1992     MIPS Computer Systems Inc  MIPS R2000 RISC architecture  1987     Morin Magnus  Predictable cyclic computations in autonomous systems  A com   putational model and implementation  Technical Report 352  Department of  Computer and Information Science  Link  pings University  5 581 83 Link  ping   1993     Motorola Inc  MC88100 RISC microprocessor user   s manual  second edition   1990     Rome Air Development Center  Griffiss AFB  NY 13441 5700  MIL HDBK 217E   Military Handbook  Reliability Predictions of Electronic Equipment     ROS90  ROSS technology  Inc  SPARC RISC user   s guide  1990     Saab Ericsson Space  Stack RISC microprocessor instruction set architecture for  prototype chip  1992     Torin Jan  Characterisation of microcomputers for embedded real time systems    directions and basic criteria  Technical Report 100  Department of Computer  Engineering  Chalmers University of Technology    412 96 Goteborg  1990     Torin Jan  Dependability in complex automotive systems  requirements direc   tions and drivers  Technical Report 128  Department of Computer Engineering   Chalmers University of Technology  5 412 96 G  teborg  1992     Young S J  Real Time Languages  Design and Development  Ellis Horwood   Chichester  1982     24    
23. lly time deterministic behaviour  we might exploit the systems functionality  and performance at compile time     1 3 Dependability    Hard real time systems are characterized by the fact that severe consequenses will result  if logical or timing correctness properties are not satisfied  They span many application  areas  avionics  undersea exploration  process control  robot systems  automotives just  to mention a few  While logical and timing correctness should be explored  or proven  if possible  during the design  implementation and test phases  actions must be taken  to handle run time failures that may arise from transient or permanent hardware errors   This is accomplished through fault tolerant hardware designs  Generally  we require a hard  real time system to be dependable in the sense that catastrophies should be avoided  thus  keeping the system in a safe state  The dependability requirements may be expressed as   Tor92      e degree of fault tolerance given as behavioural consequenses of faults  e g   fully   operational after one fault  FO   reduced operation after one fault  FR   safe op   eration after one fault  FS   For example  the dependability requirement FO FS  states that a system should be fully operational after one permanent hardware fault   regardless of which or where  and the system should remain in a safe state even if a  second fault occurs     e tolerable probability of failure that might cause the corresponding safety critical  hazard  For exa
24. mple  a system which at a fault might cause safety critical hazard  should  at the most  in one per million implemented systems cause one hazard per  year     Obviously  a dependable computer demands its own design philosophy where redundant  parts  high quality components  and careful manufacturing is of major importance     1 4 Scope    This report discusses the impact of hard real time requirements on microprocessor perfor   mance  Certain dependability aspects are also considered  although not covered in detail     1 5 Objectives    The primary objective with this report is to elaborate the microprocessor   s role in a hard  real time system  Therefore  we discuss hard real time systems and microprocessors from  an architectural point of view as well as system hardware design     The architectural considerations assume an event triggered hard real time system with  kernel software  Seven different processors were selected for architectural considerations   namely     e Motorola MC88100  Mot90    e Intel Iapx80960  Int88    e MIPS R2000  R3000   MIP87    e Cypress SPARC  ROS90    e Advanced Micro Devices Am29000  Adv88   e Inmos T800 transputer  Inm89     e Saab Ericsson Space THOR  Saa92     The hardware considerations treat a space qualified computer system  Rom  compared  to a general purpose application  using the three processors SPARC  T800 and THOR     1 6 Related work    A background to microprocessor architecture related to hard real time systems and method   olog
25. n internal process  When an external event  interrupt  pulls  EventReg active the external event channel  additional to the external link channels  is  made ready to communicate with a process  When both the event channel and the process  are ready the processor pulls HventAck active and the process  if waiting  is scheduled   Only one process may use the event channel at any given time  If no process requires an  event to occur EventAck will never be activated  If the process is a high priority one and no  other high priority process is running  the latency is typically 19 processor cycles  Setting  a high priority task to wait for an event input allows the user to interrupt a transputer  program running at low priority  The following functions take place     e Sample FventReq at pad and synchronize     e Edge detect the synchronized FventReq and form the interrupt request   e Sample interrupt vector for microcode ROM in the CPU     e Execute the interrupt routine for Event rather than the next instruction     As opposed to a more general interrupt handling approach  THOR gives hardware sup   port for synchronization between processes running on different processors  In THOR   normal executing may be preempted by an interrupt condition as well as an internal gen   erated exception or by exceptions raised by software  THOR s six input pins  reflected  in the Signal In Register  is regarded as different priority interrupt pins  Anyone turning  to an active state forces an inte
26. nd    3 3 Memory power consumtion    The memory used in the HDO configuration   64k nibble  Cypress CY7C194 is a 24 pin  device with 35 ns access time  Memory is organized as 40 bits words  32 data and 8 check  bits  thus each memory access will activate all of the ten devices     If we define the Average Memory Activity   AMA  as the fraction of processor cycles  that accesses memory in an instruction mix  the memory power consumtion could be  estimated as     Paverage   AMA Pactive    1   AMA  Pstandby    For this memory device   Pactive   650 mW    Pstandby   100 mW    Determination of AMA is complicated by several factors  The memory device needs  typically one cycle to enter standby mode after beeing accessed  Obviously  the memory  power requirement depends on the instruction execution order  If  for example  load store  instructions were ordered as every other instruction rather than consecutive instructions  then there would be more memory    active    cycles since we actually need two consecutive  cycles that do not access memory to reach the    standby    mode  In the estimations  the  instruction order as well as wait state cycles are ignored and AMA is considered a function    of     16    1  Instruction Fetch Rate  2  Instruction Mix    3  Instruction Execution Timing    Instruction Fetch Rate is limited by the instruction format  For example  with an  instruction format of 32 bits and assuming single cycle execution of all instructions every  cycle needs an inst
27. nd internal cache memories limit the possibility to thoroughly debug real time software  since the internal processor state may differ from one event of a certain kind to another  event of the same kind  It is also clear that a hard real time system implementation  that utilizes a high performance RISC CPU  such as the Cypress SPARC  does not  necessarily benefit from the high execution rate that the microprocessor offers     Strong dependability requirements imply the need for system predictability  By this we  mean that a faulty behavior that cannot be observed at compile time  through debugging  or other analysis tools  must not occur during run time  For such a design  a time triggered  real time system becomes an attractive solution since input  processing and output  are  performed essentially undisturbed by hardware interrupts and is thus time deterministic   The deterministic behaviour provide means for analyzing the application software for  logical errors as well as for time constraint violations  It might even be possible to develop  methods for proving the correctness of a given application     It is likely to believe that in the future hard real time systems will control systems that  further expose people to hazards emerging from computer failures  Automotives is  for  example  such an application  At the same time  however  such a new field dramatically  changes major presuppositions in hard real time system design  Manufacturing and main   tenance costs must b
28. ocesses was considered     Processor Freq  Total Time   MHz     mikro seconds   MC88100 12 2  I80960KB 21 4  Am29000 13 1    MIPSR2000 6 8  SPARC 17 2  T800 less than 1  THOR less than 1       Table 1  Total time required for a process switch  estimated     A complete process switch is assumed accomplished by  storing old process context    selecting a new process   load the new process context into processor registers  For THOR  and T800 there is hardware support for rescheduling  as described above  while for the  other processors  process switch was programmed  Table 1 summarises the results  Joh92      2 4 Real time system support    As stated earlier  a real time system should provide means for synchronization between  events  This requires data structures for wait and delay queues and a timer function used  to maintain system time and for process delay purposes  Another important issue is the  problem with synchronizing  local  system time with    global    time  i e different real time  systems in a distributed environment should be able to use this global time for different  purposes  Moreover  the system should provide an accurate delay time for processes that  require it  It should be clear that we are addressing an issue that is different from a  conventional real time clock in a work station application     Real time system software needs careful debugging and testing  Traditionally  pro   cessors give support for this through a    trace    instruction  i e by exe
29. processor cycles required to execute a load  store instruc   tion  For simplicity these are considered equal in this sense     e X4  denotes the number of processor cycles required for the execution of a floating  point instruction     In order to describe wait state penalties and different instruction formats the following  parameters were introduced     e W  denotes the number of wait states required for a read bus cycle  determined by  the system configuration     e U  denotes the averages number of instructions that becomes available for execution  as a result of one  32 8 bits  fetch  If  for example 70  of the instruction set consists  of instructions encoded in 16 bits and the rest are encoded in 32 bits  then     U 0 7 24 03 1 7    e Y W U  denotes the average number of cycles required to feed the processor with  one instruction  This is a function of wait state penalties and instruction format     1  W cycles    U instruction    Y       15    Since instruction fetch and execution is performed simultaneously in a pipe lined archi   tecture we write     Z     maz X1 Y W U    Zz   maz X2 Y W U    Z3   X3 W  Za   maz X4  Y W U      We obtain an expression for the Execution Rate Estimation  ERE        ERE   Za   Z2  2   2343   Z4r4 cycles     where ERE denotes the average number of cycles required to execute one instruction   Including the cycle time CT in seconds  we arrive at a final expression for the execution  rate     1 instructions    ER               R ERE CT seco
30. quires at least n memory accesses with possible  penalty and degraded performance  Thus  it is preferable to hold and pass the parameters  in registers  This requires a large number of registers  as well as conventions for the use  of these registers  The register usage conventions are specific to the different processor  architectures and will be described in the following     Besides parameter passing  a compiler generates specific code for each subprogram   This specific code is to be executed before the actual  translated high level subprogram   subprogram entry  as well as after the high level subprogram  subprogram exit   Subpro   gram entry code should  for example  allocate memory required for local variables  possibly  perform stack checking  and check pointers for valid memory accesses  Some high level  languages  such as ADA  support differentiated error handling  i e different subprograms  use different error handling routines for the same type of error  which will cause extra  overhead during run time  As examples of subprogram exit code we have deallocation of  local variables  placing return values at appropriate locations and error checking  In real   time systems  it often turns out that stack checking  memory access violation checking and  differentiated error handling must be discarded in favour of more dense code and faster  execution  However  during the debug phase of real time system software  these facilities  may be of great importance     The MC88100
31. rrupt condition  Upon receiving an interrupt  THOR ac   tivates a hardware scheduler  the interrupt priority which also may be regarded as a task  number  causes the scheduler to dispatch the corresponding task  This mechanism may  be used to synchronize tasks running under different microprocessors in a multiprocessor  environment  External events is thus rapidly gaining the microprocessors attention which  ensures a minimal interrupt latency time  THOR exception handling has adapted the  ADA language definition  To each fragment of code  or rather  each subprogram  there  exists an exception information block dynamically allocated and initialised before the  subprogram entrance  This provides for different exception processing in different subpro   grams of same type of exception  The strategy obviously decrease the overhead required  by a software kernel  When a hardware exception  which also can be raised by software   occurs the exception register is used  It points to an exception information block in the  stack  This block holds the program counter for the exception handler to call  and the  pointer to the next  outer scope  exception information block  When a hardware generated  exception is raised  the following actions occur     e Top of stack is set to the value of ER   e Stack top value  i e address of the exception handler is popped into PC     e Stack top value  now the new ER  is popped into ER     e The exception number is pushed  according to the preceding ta
32. ruction fetch  A shorter instruction format  i e more dense code  will  decrease the need for instruction fetches     The Instruction Miz is essential since  for example  load store instructions introduces  extra memory accesses  thus increasing AM A     Instruction Execution Timing affects memory activity since the fact that all instruc   tions do not execute in one cycle will reduce the need for instruction fetches  Thus the  higher execution times  the lower the AMA     Here  AMA is estimated by     1 T1 T2 T3 v4    AMA    4  2 4344         3 4 Notes on the failure rate estimations    Failure rate estimations was carried out according to  Rom   For temperature acceleration  factor calculation the thermal resistivity factor was used whenever it was available from  manufacturer   s documentation  However  since such information was rare  assumptions  had to be made about the junction temperature  For complex circuits  such as CPU s  and FPU s  a junction temperature of 110 degrees Celsius was assumed  For all others  a  junction temperature of 80 degrees Celsius was assumed     3 5 The HDO configurations    The HDO configuration is intended to characterise a space flight on board computer  It  consists of  CPU  256 kB of static random access memory  error detection and correction  circuitry  real time clock and glue logic  The designs uses only space qualified components  if nothing else is explicitly said  In the failure rate estimation for HDO configuration the  following 
33. ssible registers     A common method is to let the process stackpointer reside in the upper region of data  space  growing downwards   The stackpointer itself  upon a process switch  is stored in  the actual process PCB  That is  A minimum of operations performed to freeze a process  and maintain the ability to restart it at any later time for the operating system must be     1  Save the entire processor context by pushing it onto the stack     2  Store stackpointer value in the PCB     The process can be restarted simply by loading the stackpointer  from PCB  and  pulling processor context from the stack     For a complete process switch the old process must be preserved and a new process  must be selected and started  In a system with several runable processes  the operating  system must choose the one with the highest priority  There might for example be processes  waiting for IO  or processes waiting for synchronization with other processes in the system   In other words  Every process PCB has to be checked regarding the process status  runable  or not  and priority to pick the runable process with the highest priority  The effiency of  this activity is of major importance for a real time system where the overall function relies  on the systems ability to respond to external events and schedule an appropriate process     As an example of process switch in small real time systems a simple case was analyzed  for the studied processors  A real time system with ten runable pr
34. tood as an information processing system which has to respond to externally  generated input stimuli within a finite and specified period  You82   The functionality of  a real time system may be divided into three major parts     1  Get information  INPUT  as soon as it is available  2  Process information    3  Present result  OUTPUT  within the specified period    The time requirements laid upon real time systems impose a characteristic and an im   portant constraint  a correct result must be presented within a limited time  This time may  very well be a variable and thus dynamically impact on system behaviour  For example  consider the situation at a cross road guarded by traffic lights where the signals should  be optimized for a maximum throughput of vehicles  Another type of time requirement  is introduced in systems where the functionality depends on the system   s ability to meet  these requirements  For example consider a system that controls fuel ignition during a  rocket launch  Time requirements that must be met to insure proper system functionality  are called hard time requirements  A real time system that has to meet hard time require   ments is called a hard real time system  Hard real time systems are traditionally divided  in two major groups  event triggered systems and time triggered systems  This report is  based on an earlier study in which seven microprocessors ability to perform in an event  triggered real time system were elaborated and reported  Joh9
35. y for analysis can be found in  Joh92      Directions and basic criteria for microcomputers in embedded hard real time systems  is treated in  Tor90      Dependability in complex automotive systems are elaborated in  Tor92      A real time kernel for Robot control     HARTIC     is an attempt to meet hard real time  systems requirements at the software level  It is described by Butazzo Natale in  But93      A computational model for software in time triggered systems is described by Morin  in  Mor93      The FTCN  Fault Tolerant Computer Network Architecture  is a fault tolerant dis   tributed hard real time system described in  Bri93      2 Real time systems and microprocessor architecture    This chapter will discuss how the studied processors conform to common hard real time  requirements in their implementations as certain programming constructs  That includes  subprogram calls  interrupt handling  process switch  real time synchronization facilities  and debug support  Other aspects of high level language support are regarded as beyond  the scope of this work     2 1 Subprogram calls    A subprogram call is a result of a high level language function procedure call statement   In the case of a call func p1 p2     pn   the compilers function is to generate code  for a subprogram call with n parameters  The traditional way to do this is to push  the n parameters on stack and perform a subroutine  subprogram  call  then modify the  stackpointer and continue  However  this re
    
Download Pdf Manuals
 
 
    
Related Search
    
Related Contents
BV5000 User Handbook - BlueView Technologies, Inc.  Manual Porta vasos Casualpla  XX.Y Dynamic 10 User's Manual  Samsung P63FP Käyttöopas  リヤラダー デリカ D:5 (B230304A/B)  instrucciones de instalación para la estufa de fuel dual de 30  Manual de Uso  Safety Manager Software Reference  On-NetSurveillance NetDVMS Manual  User Manual - B&H Photo Video    Copyright © All rights reserved. 
   Failed to retrieve file