

1. [Table body not recoverable. Surviving note: e5 = Execution times in seconds.]
[Figure 19: Execution Times e and Lock Loops L. Plot not recoverable.]
CHAPTER VI
SYSTEM EVALUATION AND SUGGESTIONS FOR FUTURE RESEARCH
6.1 Summary
The major problems which exist in many current small-scale multiple microprocessor systems using a common memory as
2. [Table body not recoverable. Surviving note: execution times in seconds.]
TABLE 6 Ratios of Execution Times of a Single SBC with a Delta Configuration, Five Outer Loops
[Table body not recoverable.]
3. [Table body not recoverable.]
Note: L = Number of main loops; M = Number of outer loops; N = Number of local processing loops; c1, c2, c3, ..., c5 = Execution times in seconds; AVG = Average of execution times.
TABLE [number illegible] Execution Times of a Single Processor, Ten Outer Loops, Three Main Loops
[Table body not recoverable.]
4. [Table body not recoverable.]
Note: M = Number of outer loops; N = Number of local processing loops; a1, a2, ..., a4 and a5 = Execution times in seconds; AVG = Average of execution times.
TABLE 3 Execution Times of Three SBCs in the Delta Configuration, Ten Outer Loops
5. 1. The system is a hierarchical tree hardware structure which consists of on-line filters to select useful data and discard useless data, and a second-stage trigger which starts working when the data collected by the filter reaches some critical level.
2. It is based on the MC68000 from Motorola Incorporated.
3. System software supports data acquisition and operating system modes. The data acquisition mode provides the user with facilities to run time-critical programs. The operating system mode includes the kernel, the minimum operating system, and the extended operating system to supervise the functions of different levels in the whole system.
4. System interprocessor communication is handled by dual port memories between the slave processing cells and the supervisor through broadcasting write and wired-or read operations. One of the dual ports is realized as a back panel connector to interface with the internal system bus, while the other is realized as a front panel connector to the external bus. Each port has its own address decoding PROM, so that the same memory has different address spaces in the memory maps of the associated processing cells. A message center is provided and data exchange is permitted between slaves.
5. Each processing cell consists of a CPU module and other system components such as RAM, EPROM and an I/O interface to the host data-taking computer.
This mid-size system is used to
6. implemented, installed and used.
High expandability: Subsystems can be added to an existing system with no increase in memory contention and the least interruption of operation.
Of course, there are disadvantages of multiple processor systems. We should consider other costs associated with multiple processor systems, such as complicated software, component boards and connectors, which will reduce the performance/cost ratio. The difficulty of recoding a sequential program into a parallel one is another problem. Also, when we incrementally increase the number of processors in a multiple processor system, throughput increases at first and then becomes saturated. This saturation or reduction in throughput occurs because an increase in the number of processors results in an increase in the amount of contention for the shared resources and in the overhead of information exchange due to interprocessor communication. However, multiple microprocessor systems definitely will have many applications in the future [HORD85, HALL80].
1.2 Investigation of Current Systems
Motorola Incorporated developed the MC68000 microprocessor in 1979. D. Gosman, L. Hertzberger, and G. Kieft implemented a multiple microprocessor system with several MC68000s in 1982. This is the only such system using this particular microprocessor that has been reported publicly [GOSM82]. The main features of this system are as follows:
8. [Table body not recoverable.]
Note: M = Number of outer loops; N = Number of local processing loops; c1/a1, c2/a2, c3/a3, c4/a4 and c5/a5 = Ratios of execution times; AVG = Average of ratios.
TABLE 7 Ratios of Execution Times of a Single SBC with a Delta Configuration, Ten Outer Loops
[Table body not recoverable.]
9. MULTIPLE 68000 MICROCOMPUTER SYSTEM
by CHIEN CHI [remainder of the author and degree lines illegible]
COMPUTER SCIENCE
Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE
Approved
August 1986
ABSTRACT
In this thesis the design, construction and evaluation of a multiple processor microcomputer system are presented. The design logic of the system is discussed through a review of multiple processor systems which exist today. Their problems are then surveyed. A flexible system consisting of modules of Single Board Computers (SBCs) and Dual Port Memories (DPMs) is developed to try to solve the problems existing in the current systems. Interprocessor communication hardware is constructed to meet the interfacing and timing requirements between the control units and the dual port memories. Benchmark programs are designed and coded in the MC68000 Assembly Language to evaluate the system.
ABSTRACT
CHAPTER
I. INTRODUCTION
1.1 Advantages of Multiple Microprocessor Systems
1.2 Investigation of Current Systems
1.3 Problems in Current Systems
1.4 Preview
II. SYSTEM CONFIGURATIONS
2.1 System Reconfiguration
2.2 Expansion Capability
2.3 The System Configuration Developed for this Research
2.4 Preview
III. SYSTEM COMPOSITION
The Control Unit
The Dual Port Mem
10. LIST OF FIGURES
1. Basic Delta Configuration
2. Pipeline Configuration
3. Pyramid Configuration
4. Delta Configuration
5. Star Configuration
6. Square Configuration
7. System Topology
8. Socket Pin Distribution
9. MC68000 and TMS9650 Connections
10. Message Transfer in the Developed Delta System
11. Flow Chart for Processor1
12. Flow Chart for Processor2
13. Flow Chart for Processor3
14. Flow Chart for a Single Processor
15. Flow Chart for the DPM Subroutines
16. An Application of the System Designed
17. Execution Times c and a, and Ratio c/a
18. Execution Times d and b, and Ratio d/b
19. Execution Times e and Lock Loops L
20. Generation of OE and WE
21. Diagram of the Interface Circuit
CHAPTER I
INTRODUCTION
The speed of system components will not continue to increase as fast as it has in the past because, according to the laws of physics, the speed at which a digital computer can transfer units of information is approaching the upper bound of the electronic conducting capability of semiconductors. However, system performance requirements continue to grow. In order to increase system performance, system expansion capability
11. [Appendix B program listing, MC68000 assembly with object code. The listing is not recoverable from the scan; legible mnemonics include MOVE, TAS.B, BMI.S, BNE.S and RTS, with operands such as $020009, $02000B, $040009 and $04000B.]
PERMISSION TO COPY
In presenting this thesis in partial fulfillment of the requirements for a master's degree at Texas Tech University, I agree that the Library and my major department shall make it freely available for research purposes. Permission to copy this thesis for scholarly purposes may be granted by the Director of the Library or my major professor. It is understood that any copying or publication of this thesis for financial gain shall not be allowed without my further written permission and that any user may be liable for copyright infringement.
Disagree (Permission not granted)
Agree (Permission granted)
[signature]
Student's signature
Stud
12. [Appendix B program listing, MC68000 assembly with object code. The listing is not recoverable from the scan; legible mnemonics include MOVE.L, MOVE.B, CMP.B, ADD.L, BNE.S, BGT.S, BGT.L, BSR.S, TAS.B, BMI.S and RTS, with operands such as $020008, $020009, $040008 and $040009.]
13. [Table body not recoverable.]
Note: M = Number of outer loops; N = Number of local processing loops; d1/b1, d2/b2, d3/b3, d4/b4 and d5/b5 = Ratios of execution times; AVG = Average of ratios.
[Figure 17: Execution Times c and a, and Ratio c/a. Plot not recoverable.]
[Figure 18: Execution Times d and b, and Ratio d/b. Plot not recoverable.]
and SBCs on both sides of the DPM start trying to access the DPM again. Overhead r
14. data and message exchange.
4. Some processors are configured as consumers (they consume data) and others as producers (they produce data). Only the producers can write to the common memory, so the function of this system is limited to very specific applications.
5. A mailbox is used as the interprocessor communication scheme, and interrupts are adopted to start communicating messages.
6. The system bus supplies the address, data and control lines that allow the various components to interact through an external arbitration unit.
It is a master-slave structure in that the bus master takes control of the bus and the other devices are prevented from gaining access to it. Message exchange is limited and communication speed slows down because task cooperation between processors in the system must go through the executive.
1.3 Problems in Current Systems
The main problems of the current multiple microprocessor systems are the following:
1. The system organizations of the multiple microprocessor systems which exist today are limited to specific applications. When the application changes, they become useless. A new system must be built, resulting in a waste of resources and implying additional financial investment.
2. They are mostly composed of 8-bit microprocessors, which are slow when extensive memory accesses are required and hard to program. For scientific applications which need extensive arithmet
15. find or build a different DPM which has a much greater memory capacity.
BIBLIOGRAPHY
[ALEX84] Alexandridis, Nikitas A., Microprocessor System Design Concepts, 1984, Computer Science Press, Inc.
[BALP81] Balph, Tom and John Black, Multiprocessing Could Bring Out a System's Best: Application of VERSAbus and VERSAbus Products, 1981, Motorola Inc.
[BAUM84] Baumgartner, Pulse Fundamentals and Small Scale Digital Circuits, 1984, Reston Publishing Company, Inc., Pages 460-510.
[BORO79] Borovits, Israel and Seev Neumann, Computer Systems Performance Evaluation, 1979, D. C. Heath and Company, Pages 9-26.
[BOZY82] Bozyigit and [co-author illegible], A Topology Reconfiguration Mechanism for Distributed Computer Systems, 1982, The Computer Journal, Vol. 25, No. 1, Pages 87-92.
[CANN82] Cannon, L., Fundamentals of Microcomputer Design, 1982, Texas Instruments Learning Center.
[DAVI81] Davis, Rex, Prioritized Individually Vectored Interrupts for Multiple Peripheral Systems with MC68000, 1981, Motorola Inc.
[ENSL74] Enslow, Philip H., editor, Multiprocessors and Parallel Processing, 1974, John Wiley & Sons, Inc.
[FATH83] Fathi, Eli T. and Moshe Krieger, Multiple Microprocessor Systems: What, Why, and When, 1983, March, IEEE, Pages 23-32.
[entry fragment] Computer Systems Performance ..., Prentice-Hall, Inc.
16. in the next section.
1. Delta Configuration: This consists of three SBCs forming a master-master type multiple processor system. Each SBC works independently and can communicate with the two other SBCs through the dual port memories connecting them. The Delta configuration is shown in Figure 1.
2. Pipeline Configuration: In this configuration data enters one end of the structure and is processed by each of the processors in the system. Finally, data exits the other end. See Figure 2.
2.2 Expansion Capability
If we expand the system of three SBCs and three DPMs by adding one more SBC and three more DPMs, we will have a fully connected system with four processing nodes and six branches forming a Pyramid structure. This structure can be reconfigured into other structures by ignoring some nodes and/or branches in the system. The following explains the possible system configurations.
1. Pyramid Configuration: This is a symmetric system with high flexibility and reliability. It is a reliable interconnection scheme because it provides alternate paths when a direct path between two units fails. With its high reconfiguration ability, this system can be used as a unit to form a large-scale system which may include hundreds of SBCs.
[Figure 1: The Basic Delta Configuration (three SBCs joined by three DPMs)]
[Figure 2: The Pipeline Configuration]
The Pyramid configuration and the Delta configuration gi
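The branch counts quoted in Section 2.2 (three DPMs for three SBCs, six for four) follow from giving every pair of SBCs its own DPM. The following is a minimal sketch of that count, not part of the thesis; the function name `fully_connected_links` and the `SBC1`, `SBC2`, ... labels are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def fully_connected_links(num_sbcs):
    """Return every SBC pair that needs its own DPM in a fully connected system."""
    sbcs = [f"SBC{i}" for i in range(1, num_sbcs + 1)]
    return list(combinations(sbcs, 2))


delta = fully_connected_links(3)    # the Delta configuration
pyramid = fully_connected_links(4)  # the Pyramid configuration

# Going from Delta to Pyramid adds one SBC and len(pyramid) - len(delta) DPMs.
print(len(delta), len(pyramid), len(pyramid) - len(delta))
```

Under this assumption the number of DPMs grows as n(n-1)/2, which is why a large system would be built from Pyramid or Delta units rather than as one fully connected structure.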
17. is employed to ensure system synchronization. The next chapter is dedicated to the hardware implementation of the interface board and the system connection.
CHAPTER IV
INTERPROCESSOR COMMUNICATION HARDWARE
In this chapter, we briefly describe the design and connection of the interface board. The design requirements for the control lines and the connection of the address lines and data lines are mentioned. Hardware wiring details are shown in Appendix A for reference. The MC68000 Assembly Language code used to test the functions of the system is shown in Appendix B.
4.1 The Design of the Interface Logic
According to the functions described in the data sheet [TEXA84B], the TMS9650 chip is designed to control the interface between two processors in a multiple processor system. Although the functions of the TMS9650 are close to what is needed in the design, the signal specifications are quite different from those of the MC68000. Therefore, we designed and built a logic circuit to interface the TMS9650 to the MC68000. The following describes the related specifications of the MC68000 and the TMS9650 and the logic required to match these specifications.
The R/W signal from the MC68000 is high when the MC68000 issues a bus cycle to read from memory. This signal is driven low when the MC68000 is writing into memory. The corresponding signals needed by the TMS9650 for read and write operations are from two
18. perform calculations encountered in nuclear physics. The slave processing cells are arranged in a pipelined architecture with no direct interaction between them. It is a special-purpose multiprocessor system restricted to one application. Expansion of this system is possible, but the application is still limited.
For comparison, the following refers to two recent representative works in small-scale multi-microprocessor systems.
Paul Russo [RUSS77] designed a multi-microcomputer system using the COSMAC microprocessor in 1977. The COSMAC microprocessor is implemented as a single 40-pin C2L MOS LSI chip. The main features of the COSMAC are as follows:
1. 8-bit multiplexed address bus, 16-bit address register.
2. 8-bit data bus, 8-bit data register.
3. One-byte instruction format.
4. No Multiply and Divide instructions.
As far as extensive memory access is concerned, this system is inherently slow because of its 8-bit structure. The number of instructions is quite small because of its one-byte instruction format. The multiplexed address bus complicates the hardware design. The lack of built-in instructions for Multiply and Divide makes scientific applications slow and inconvenient.
The interface philosophy of this multi-microprocessor is as follows:
1. A master-slave organization is used.
2. No common memory is provided.
3. Each CPU has sufficient RAM to handle its dedicated work load.
4. T
19. processing power from the additional processors. Recently, some new requirements and complex applications, which have an explicit parallel nature of computations, need additional processing power [LEIB85]. In the past, an additional processing unit was either too expensive or not powerful enough to meet these requirements. With advances in microprocessor technology, the cost of processing power can now be ignored when compared with the cost of other system parts. Therefore, it is very attractive to use additional processors in applications which require extensive computations, such as data processing in weather prediction systems and target identification in radar systems. In these situations, no additional expensive peripheral devices are needed. The performance/cost ratio increases when a multiple processor system is used.
High throughput: System throughput increases as the number of processors in the system increases. When a high-throughput system is required, it is easier to build a properly configured multiple processor system from off-the-shelf processors than it is to redesign and construct a new, expensive processor with higher throughput.
High reliability: Some specific environments and situations, such as applications in hospitals, space shuttles, nuclear power plants, and military equipment, demand highly reliable processing power to meet their requirements. Most multiple microprocessor str
20. separate lines, the OE (Output Enable, Low) and the WE (Write Enable, Low). Therefore, the R/W is ANDed and ORed with the outputs of the D latch to produce the OE and the WE.
The higher-order byte of the data bus from the MC68000 is connected to the data bus of the TMS9650, so that the address line A0 from the MC68000 is 1 when the DPM is addressed. Three additional address lines, A1, A2 and A3, which are connected to S0, S1 and S2 on the TMS9650 chip, are needed to address each of the eight registers in the TMS9650.
The DTACK, which is necessary for MC68000 asynchronous bus operations, is generated using the READY output signal from the TMS9650 with the proper delay produced by a D latch. The logic equation expresses DTACK in terms of Q1 and Q4 (the operator is illegible in the scan), where Q1 and Q4 are outputs from the Quad D latch.
4.2 The Connection of the TMS9650 to the MC68000
On the MEX68KECB there is an area which is designed for system expansion and signal access from the outside world. A wire-wrap area is provided to mount devices for port addition and memory units for memory extension. The MC68000 bus lines and system timing signals are accessed through an auxiliary I/O designated J15 on the MEX68KECB. Two 20-pin sockets are mounted in this area. See Figure 8. These sockets are connected to the TMS9650 interface board with 20-wire flat cables. The pins of the sockets are connected to a connection area designated J1
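To make the requirement on the control lines concrete, the following is a minimal behavioral sketch, not part of the thesis, of splitting the single R/W line into the two active-low enables. It models only the stated requirement (OE asserted on a selected read, WE asserted on a selected write). The exact gate arrangement with the D latch outputs is not reproduced here, and the name `split_read_write` and the `selected` input are assumptions made for illustration.

```python
def split_read_write(rw, selected):
    """Derive active-low OE and WE from the MC68000 R/W line.

    rw:       1 for a read bus cycle, 0 for a write bus cycle.
    selected: True while the (assumed) latch output enables the DPM.
    Returns (oe_n, we_n), where 0 means the signal is asserted.
    """
    oe_n = 0 if (selected and rw == 1) else 1  # Output Enable, low on reads
    we_n = 0 if (selected and rw == 0) else 1  # Write Enable, low on writes
    return oe_n, we_n


# Print the full truth table of the sketch.
for selected in (False, True):
    for rw in (0, 1):
        print(selected, rw, split_read_write(rw, selected))
```

The two outputs are never asserted together, which is the property the interface board must guarantee whatever gates are used to build it.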
21. the message exchange medium are bus and memory conflicts. In this research, a multiple microprocessor system in a delta configuration has been designed and built. In this system, interprocessor communication is performed through dual port memories, which impose no bus conflicts and minimal memory conflicts on the system. There are three SBCs and three DPMs in the system. The MEX68KECB developed by Motorola Inc. is used as the functional unit, and the TMS9650 from Texas Instruments Inc. is used as the dual port memory to transfer messages between SBCs.
A set of benchmark programs is used to measure the increase in throughput, which is the inverse of the execution time of a job performed by a computer system, and the amount of memory conflict which occurs in the system when extensive access is made to the DPMs. The test results are shown in Chapter 5 in Tables 2 through 8 and are plotted in Figures 16 through 18. These results are used in the next section to evaluate the system.
6.2 Evaluation
As shown in Tables 6 and 7 and Figures 16 and 17, the increase in throughput ranges from 2.15 to 2.94. The lowest increase ratio, 2.15, results from the fact that a smaller number of local processing loops means a relatively higher DPM access frequency. This implies that there are relatively more memory conflicts. Thus the throughput ratio of the multiple processor system is lower when there is re
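Since throughput is defined above as the inverse of execution time, the increase in throughput is simply the single-SBC time divided by the delta-configuration time. The following is a minimal sketch of that arithmetic, not part of the thesis; the function names and the two timing values are hypothetical and are not measurements from the tables.

```python
def throughput_ratio(single_time_s, delta_time_s):
    """Ratio of throughputs: (1 / delta_time) / (1 / single_time)."""
    return single_time_s / delta_time_s


def efficiency(ratio, num_sbcs=3):
    """Fraction of the ideal speed-up (one per SBC) that is achieved."""
    return ratio / num_sbcs


# Hypothetical timings, chosen only to illustrate the calculation.
ratio = throughput_ratio(30.0, 12.0)
print(ratio, efficiency(ratio))
```

With three SBCs the ratio cannot be expected to exceed 3, so the reported range of 2.15 to 2.94 can be read as the share of that ideal left after DPM access conflicts.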
22. to SBC1 through DPM3. Thus SBC1 may use these changes to compare with current values input from its sensors to see if the changes have been reflected in the tank.
Sara Lee Bakeries, Deerfield, Illinois, has a multiple processor system with a common memory that performs a similar process control in its production procedures [WEIN86]. The multiple microprocessor system with DPMs developed in this research, if used in a plant like Sara Lee Bakeries, could speed up these control processes by eliminating bus contention and reducing memory conflicts. This would produce more effective data acquisition and control signals, which should result in higher quality control and better products.
5.3 The Test Results
At the termination of each test, each SBC reached a breakpoint. All register contents, including the final values of M and N, were displayed on the ACT-5A screen. In every case the system ran correctly, producing the correct results.
In order to compare the increased throughput of our system, we embedded an almost identical program in a main loop which executed it three times. This enlarged program was then run in a single SBC system. For all tests we collected and compared elapsed execution times for various numbers of inner loops. The results are shown in Tables 2 through 5. The ratios of the elapsed times for the two systems are shown in Tables 6 and 7. All tabulated data is displayed graphicall
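The single-SBC comparison program can be pictured as three nested counters: the main loop (L), the outer loops (M) and the local processing loops (N). The following is a minimal sketch of that structure, not the Appendix B program itself; the counter names c0, c1 and c2 follow the flow chart labels, the function name `run_test_program` is hypothetical, and the DPM accesses made in each outer loop are omitted.

```python
def run_test_program(main_loops, outer_loops, local_loops):
    """Count the local processing steps done by the nested test loops."""
    local_steps = 0
    c0 = 0                      # main loop counter
    while c0 < main_loops:
        c1 = 0                  # outer loops counter
        while c1 < outer_loops:
            c2 = 0              # local processing loops counter
            while c2 < local_loops:
                c2 += 1
                local_steps += 1
            c1 += 1
        c0 += 1
    return local_steps          # the real program stops at a break point here


# One SBC running the main loop three times does the work of three SBCs.
print(run_test_program(3, 5, 100), 3 * run_test_program(1, 5, 100))
```

This is why the elapsed time of the enlarged single-SBC program is a fair baseline for the three SBCs of the delta configuration working in parallel.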
23. 7, which gives access to the MC68000 data, address and control buses. The output pins E1 and E2 from the address decoder of the MC68000 on the MEX68KECB, which provide enable signals for unused areas of the MC68000 memory map, are not included in the connection area J17. They are connected to the sockets separately. DTACK signals are wired to pin [number illegible] from the sockets. Open-collector AND gates are used in the TMS9650 interface board for these special signals. Power supply and ground lines, Vcc and Vss, are drawn from the interior of the MEX68KECB to the sockets because they are not contained in the J17 area. The details are shown in Figure 9.
[Figure 8: Socket Pin Distribution. The pin table is not fully recoverable; legible entries pair MC68000 signals with TMS9650 signals on a 20-pin socket, including RESET, GND, A1 to S0, A2 to S1, A3 to S2, D0 through D7, DTACK, R/W, AS, and 8MHZCLK to CLKIN. Signals marked X are connected to the interface board.]
[Figure 9: MC68000 and TMS9650 Connections. Block diagram of the MC68000, the interface logic and the TMS9650, with the E and 8 MHz signals.]
4.3 Preview
An algorithm for performance evaluation of the system and the measurement of the execution time of test programs, which are listed in Appendix B, are prese
25. [Figure 13: Flow Chart for Processor3 (fragment). Legible steps: initialize the local processing loops counter C2 to zero; increment C2 by 1; break point.]
[Figure 14: Flow Chart for a Single Processor. Legible steps: set the number of main loops; initialize the main loop counter C0; set the number of outer loops; initialize the outer loops counter C1 to zero; set the number of local processing loops; initialize the local processing loops counter C2 to zero; increment C2 by 1; increment C1 by 1; increment C0 by 1; break point.]
[Figure 15: Flow Chart for the DPM Subroutines. DPM2 subroutine: test and set the semaphore; set the message transfer flag to 1 and release the DPM2; return from the DPM2 subroutine. DPM4 subroutine: test and set the semaphore; test whether the CC N is set; test whether the CC Z is set; set the message transfer flag and release the DPM4; return from the DPM4 subroutine.]
[Figure 16: An Application of the System Designed (chemical tank).]
SBC3 takes messages from DPM2, calculates the necessary changes in the process parameters, and sends control signals to the digital/analog convertor, which has four sensors connected to it. These sensors control valves to adjust the process parameters to some values. The changes made by SBC3 are also sent
26. APPENDIX A: THE HARDWARE DETAIL OF THE INTERFACE BOARD
The splitting and combining of the signals from the MC68000 to produce the signals which are needed to drive the TMS9650 are analyzed in the following.
1. Activation of the Interface Board: The output signal E (E1 or E2) from the address decoder of the MC68000 is used to clear and activate the first Quad D latch of the dual port interface board.
2. Generation of the CS (Chip Select, Low) Signal: The low output of the first D latch is used as the CS signal to the TMS9650, and the high output is connected to the input of the second D latch of this Quad D latch.
3. Generation of the OE (Output Enable, Low) and WE (Write Enable, Low) Signals: The outputs of the second D latch are combined with R/W from the MC68000 after one clock delay of 125 ns. The circuit is shown in Figure 20. The Quad D latches are driven with the 8 MHz clock. These combined signals are used as two separate enable signals, OE (Output Enable) and WE (Write Enable), which are required by the TMS9650.
4. Selection of Two Paths: The output Q of the second D latch is connected to two AND gates. After being combined with the ORed output and the NORed output of the address lines A1 and A2 from the MC68000, it is used to select two signal paths.
1. Generation of the DTACK (Data Transfer Acknowledge, Low) Signal: One path is selected w
27. [entry fragment] Ferrari, Domenico, ... Evaluation, 1978, Pages 26-89, 160-217.
[FLET80] Fletcher, William, An Engineering Approach to Digital Design, 1980, McGraw-Hill Company.
[GOSM82] Gosman, D., L. O. Hertzberger, and G. Kieft, The FAST Amsterdam Multiprocessor System Hardware, 1982, Feb., IEEE Transactions on Nuclear Science, Vol. NS-29, No. 1, Pages 314-318.
[GREB84] Grebene, Alan B., Micro Linear Corporation, Bipolar and MOS Analog Integrated Circuit Design, 1984, A Wiley-Interscience Publication, John Wiley & Sons, Pages 599-615.
[GRIN85] [surname illegible], Jose M., Mateo Valero Cortes, Enrique Herrada [illegible], and Jesus Labarta, Analysis and Simulation of Multiplexed Single-bus Networks with and without Buffering, 1985, May, IEEE Conference on Supercomputers, Pages 414-421.
[GROV82] Groves, Stan, The Inter-Relationship between Access Time and Clock Rate in an MC68000 System, 1982, Motorola Inc.
[HALL80] Hall, Douglas V., Microprocessor and Digital Systems, 1980, McGraw-Hill, Inc.
[HEWL80] HP64000 Logic Development System, Hewlett-Packard Company, Colorado Springs Division, 1980.
[HARM85] Harman, Thomas L. and Barbara Lawson, The Motorola 68000 Microprocessor Family: Assembly Language, Interface Design, and System Design, 1985, Prentice-Hall, Inc.
[HORD85] Mich
28. [Table body not recoverable.]
Note: M = Number of outer loops; N = Number of local processing loops; b1, b2, b3, b4 and b5 = Execution times in seconds; AVG = Average of execution times.
TABLE 4 Execution Times of a Single Processor, Five Outer Loops and Three Main Loops
[Table body not recoverable.]
29. ael [Hordeski], Design of Microprocessor Sensor & Control Systems, 1985, Reston Publishing Company, Inc.
[JOHN84] Johnson, James B. and Steve Kassel, The Multibus Design Guidebook: Structures, Architectures, and Applications, 1984, McGraw-Hill, Inc.
[KART82] Kartashev, Svetlana, ed., and Steven Kartashev, Designing and Programming Modern Computers and Systems, Vol. I: LSI Modular Computer Systems, 1982, Prentice-Hall, Inc.
[KNIG78] Knight, W. and J. B. Williams, Dual Access Memory Architectures, 1978, General Electric Company.
[KOHA78] Kohavi, Zvi, Switching and Finite Automata Theory, 1978, McGraw-Hill Book Company.
[LEIB85] Leibowitz, Burt and John Carson, Multiple Processor Systems for Real-Time Applications, 1985, Prentice-Hall, Inc.
[MACG85] MacGregor, Doug, Motorola Inc., and Jon Rubinstein, Performance Analysis of MC68020-based Systems, 1985, December, IEEE MICRO, Pages 50-71.
[MAPL85] [author illegible], Analyzing Software Performance in a Multiprocessor Environment, 1985, July, IEEE Transactions on SOFTWARE, Pages 50-63.
[MARO82] Marovac, Nenad, The Rotating Bus as a Basis for Interprocess Communication in Distributed Systems, 1982, The Computer Journal, Vol. 25, No. 1, Pages 22-31.
[MORR82] Morris, Michael F. and Paul F. Roth, Computer Perf
30. data buffer between the data bus and the static RAM. The data to be accessed is put in the Data Register before it is transferred to the RAM or to the MC68000 registers. The Data Register address in the memory map of the MC68000 is $0y0009, where y is either 2 or 4 depending on which DPM is to be addressed. The addresses of the TMS9650 registers are listed in Table 1.
TABLE 1 The Addresses of the TMS9650 Registers in the MC68000 Memory Map
Columns: Memory Map Address (hexadecimal); Address Lines; Register Select Lines; Register Selected.
Registers selected, in the order listed: Data Inc, Control, Message In, Status, Data, Local Address, Message Out, Remote Address.
[Address and select-line values not recoverable.]
b. Data Inc Register: This has the same function as the Data Register, except that when it is accessed, the content of the corresponding Local Address Pointer is incremented by 1 to point to the next location in the RAM. The Data Inc Register address is $0y0001.
Message Out Register: This can be used to transfer one byte between the two SBCs. Its address is $0y000D.
Message In Re
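The three addresses stated in the text ($0y0001 for Data Inc, $0y0009 for Data and $0y000D for Message Out) are consistent with a simple rule: the register select value sits on A3-A1 and A0 is always 1. The following is a minimal sketch of that rule, not part of the thesis. It assumes that the order of the register names in Table 1 is the order of the select values 0 through 7; the addresses it produces for the other five registers are therefore extrapolations, and the names `REGISTERS` and `register_address` are hypothetical.

```python
# Register names in the order they appear in Table 1 (assumed to be
# select values 0..7 on S2-S0, i.e. on address lines A3-A1).
REGISTERS = [
    "Data Inc", "Control", "Message In", "Status",
    "Data", "Local Address", "Message Out", "Remote Address",
]


def register_address(dpm_digit, name):
    """Return the assumed MC68000 address $0y00xx of a TMS9650 register.

    dpm_digit: the y in $0y0000 (2 or 4 in the text).
    name:      one of the names in REGISTERS.
    """
    select = REGISTERS.index(name)   # assumed value on A3..A1
    base = dpm_digit << 16           # $0y0000
    return base + (select << 1) + 1  # A0 = 1 selects the connected byte


for name in REGISTERS:
    print(f"{name:15s} ${register_address(2, name):06X}")
```

Only the three addresses quoted in the text are confirmed by the source; the remaining values should be checked against the TMS9650 data sheet before use.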
31. ating speed, lower power consumption, and more software support for multiple processor system applications than most other popular microprocessors used today.
The Motorola Educational Computer Board (MEX68KECB) is a uniprocessor microcomputer system (an SBC) based on the MC68000 [MOTO82]. Other system components included in this SBC are on-board memory and three I/O interface chips: two MC6850 ACIAs (Asynchronous Communication Interface Adapters) for serial interface with a terminal and a host computer, and an MC68230 Parallel Interface/Timer for a parallel interface with a printer and/or a cassette tape recorder. The on-board memory includes two MCM68A364 ROMs for the system firmware (the TUTOR monitor and the RESET routine vectors) and sixteen MCM4116 dynamic RAMs to store other exception handling routine vectors, system programs, user programs, and data [MOTO81, MOTO82].
In this research, three MEX68KECBs and three TMS9650 DPMs [TEXA84B] are assembled to implement the delta configuration of the multiple processor structure described in Chapter 2. Each SBC has an ACT-5A terminal for its system console. The system topology is shown in Figure 7. Each of the three MC68000s simultaneously executes TUTOR in its local ROM when power is turned on. Test programs are downloaded to the local memory of each MEX68KECB from the Hewlett-Packard 64000 Logic Development System [HEWL80], and the proper starting add
32. ch processor checks the first byte of DPM2 (ignoring bit 7, which is the sign bit of this byte) to see if it has a value of 0. If yes, the message which it previously passed has been received. The processor then writes into DPM2, notifying the successor SBC, and sets its first byte to 1 (ignoring bit 7). If no, the processor enters a waiting loop.
b. Receiving Notification
i. Each processor checks the semaphore bit, which is bit 7 in the first byte of DPM4, to see if DPM4 is available. If yes, it then goes ahead to access DPM4. Otherwise, the processor enters a waiting loop.
ii. Each processor checks the first byte of DPM4 to see if it has a value of 1. If yes, the message which it expected has arrived. The processor then reads from DPM4 the notification from the predecessor and sets its first byte to 0 (ignoring bit 7). If no, the processor enters a waiting loop.
Termination: Each SBC exits from the test program and reaches a break point at which the final values of all registers are displayed on its ACT-5A screen.
The flowcharts of the algorithm are shown in Figure 11 through Figure 15.
The test procedure simulates a real-time system employed in a chemical plant, with several sensors hooked to each SBC to speed up the data processing rate. As shown in Figure 16, SBC1 is responsible for processing the data coming from an analog-to-digital converter, which has four sen
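The sending and receiving steps above can be sketched in Python as a minimal, single-threaded model. It assumes, as the text describes, that bit 7 of the first byte is the semaphore and that the remaining bits hold the message flag (0 = consumed, 1 = new message). The payload location (`ram[1]`), the explicit `release` step, and all function names are illustrative assumptions, not taken from the benchmark program.

```python
SEM_BIT = 0x80    # bit 7 of the first byte: the semaphore
FLAG_MASK = 0x7F  # remaining bits of the first byte: the message flag

def acquire(ram):
    """TAS-like step: report whether bit 7 was clear, then set it either way."""
    was_free = (ram[0] & SEM_BIT) == 0
    ram[0] |= SEM_BIT
    return was_free

def release(ram):
    """Clear the semaphore bit, leaving the message flag untouched (assumed step)."""
    ram[0] &= FLAG_MASK

def send_notification(ram, payload):
    """Sender side (DPM2 in the text). Returns True if the message was posted."""
    if not acquire(ram):
        return False                      # DPM busy: caller keeps waiting
    posted = (ram[0] & FLAG_MASK) == 0    # was the previous message consumed?
    if posted:
        ram[1] = payload & 0xFF           # hypothetical payload location
        ram[0] = SEM_BIT | 1              # flag = 1, semaphore still held
    release(ram)
    return posted

def receive_notification(ram):
    """Receiver side (DPM4 in the text). Returns the payload or None."""
    if not acquire(ram):
        return None                       # DPM busy: caller keeps waiting
    payload = None
    if (ram[0] & FLAG_MASK) == 1:         # has the expected message arrived?
        payload = ram[1]
        ram[0] = SEM_BIT                  # flag = 0, semaphore still held
    release(ram)
    return payload
```

In the real system each call would sit inside a waiting loop and the test-and-set would be one indivisible bus operation; here a `False` or `None` return simply stands for "stay in the waiting loop".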
33. e, Analog, Microprocessors, Military. 1976. Signetics Corporation.
[SILV82] Silverman, Gordon, Avram Stundel, and John Lehman. The Modular Multiprocessor: A Model for Laboratory Instrument Design. 1982. IEEE Micro, pages 51-62.
[SIMT82] Smith, Kevin, Senior Editor. System Harnesses up to Eight 68000s, Achieving 4.8 MPS. November 1982. Engineering Notes, Motorola, Inc.
[STAR80] Starnes, Thomas W. Design Philosophy Behind Motorola's MC68000, Part 1 to Part 3. 1980. Motorola, Inc.
[STRI79] Stritter, Edward, and Tom Gunter. A Microprocessor Architecture for a Changing World: The Motorola 68000. February 1979. IEEE Micro.
[TEDD84] Tedd, Mike, Stefano Crespi-Reghizzi, and Antonio Natali. Ada for Multi-Microprocessors. 1984. Cambridge University Press.
[TEXA84A] Texas Instruments. High-speed CMOS Logic Data Book. 1984. Texas Instruments Incorporated.
[TEXA84B] Texas Instruments Inc. TMS9650 Multiprocessor Interface Manual. 1984. Texas Instruments Incorporated.
[TOUN81] Toung, Hoo-Min D., and Amar Gupta. An Architectural Comparison of Contemporary 16-bit Microprocessors. 1981. IEEE Micro.
[VIVE82] P., G. Conte, D. Del Corso, F. Gregoretti, and Pasero. The Micro Project: An Experience with a Multimicroprocessor System. 1982. IEEE Micro, pages 38-50.
[WEIN86] Weiner, L. Private conversations. Texas Tech University, 1986.
APPENDIX A
THE HARDWARE DETAIL OF THE INTERFACE BOARD
AP
35. esulting from memory conflicts was estimated from these execution times. The data obtained from these measurements are listed in Table 8 and plotted in Figure 19. We analyze this data in the next chapter.
5.4 Preview
In the concluding chapter, Chapter 6, a summary of this research project is presented first, results obtained from execution of the test programs are analyzed, and finally, future extensions of this research are discussed.
TABLE 8: Execution Times and Lock Loops
36. gister: A byte transferred from the remote side through the Message Out Register appears on the local side in the Message In Register. The address is 0y0005.
e. Local Address Pointer: This specifies the RAM location the MC68000 is going to access. Its address is 0y000B.
f. Remote Address Pointer: The Local Address Pointer is known as the Remote Address Pointer by the MC68000 on the other side of the DPM.
g. Control Register: This is used to enable or disable the bits in the Status Register and has address 0y0003.
h. Status Register: This register indicates the interrupt status of the TMS9650 and issues an interrupt signal to the MC68000 through the output pin INT on the TMS9650 chip. 0y0007 is the address.
2. The static RAM is accessed by both interfaces through the Registers. It can only be accessed by one microprocessor at a time. The two outputs of the arbitration latch (one of which is ACTA) control access to the RAM. If we need to write a message with value VAL into a location with value LOC in the RAM, we first write the value LOC into the Local Address Pointer; then we write the value VAL into the Data Register. The read operation, on the other hand, is performed by writing a proper address into the Local Address Pointer and then reading from the Data Register.
3. The arbitration logic (arbitration latch) is responsible for granting RAM access to one port or the other to solve possible conten
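The two-step write and read sequence described above (load the Local Address Pointer, then access the Data Register) can be illustrated with a small Python model of one port of the DPM. This is only a sketch under stated assumptions: the class and method names are hypothetical, the pointer is assumed to wrap at 256 bytes, and the Data Inc access is assumed to increment the pointer after the transfer. None of these details are confirmed by the text.

```python
class DualPortMemoryPort:
    """Toy model of one port's view of the 256-byte RAM behind the registers."""

    RAM_SIZE = 256

    def __init__(self, ram=None):
        # The RAM may be shared with a second port object for the other side.
        self.ram = ram if ram is not None else bytearray(self.RAM_SIZE)
        self.local_address = 0            # models the Local Address Pointer

    def write_local_address(self, loc):
        self.local_address = loc % self.RAM_SIZE   # wrap-around is an assumption

    def write_data(self, val):
        """Data Register write: store at the location the pointer selects."""
        self.ram[self.local_address] = val & 0xFF

    def read_data(self):
        """Data Register read: fetch from the location the pointer selects."""
        return self.ram[self.local_address]

    def read_data_inc(self):
        """Data Inc Register read: fetch, then advance the pointer (assumed order)."""
        val = self.ram[self.local_address]
        self.local_address = (self.local_address + 1) % self.RAM_SIZE
        return val


def write_message(port, loc, val):
    """Write VAL into location LOC: pointer first, then the Data Register."""
    port.write_local_address(loc)
    port.write_data(val)


def read_message(port, loc):
    """Read location LOC: pointer first, then the Data Register."""
    port.write_local_address(loc)
    return port.read_data()
```

Two port objects built on the same `bytearray` give a rough picture of the local and remote sides, each with its own address pointer; arbitration between them is not modeled.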
37. he master CPU has sufficient RAM to buffer all information to be exchanged between it and all the slave processors.
5. Interprocessor communication can be either via programmed I/O mode or via Direct Memory Access (DMA) for block transfer.
6. The master can interrupt any slave, but no slave can interrupt the master processor.
7. Information exchange between slave CPUs can only occur via the master.
8. The master's DMA channel is time-multiplexed between the various slave CPUs under its control.
The master-slave interprocessor communication scheme makes interaction between slave CPUs slow because their communication can only occur via the master. There is a specific time delay in data transfer, especially when the number of slaves increases, since time-multiplexed data transfer is used.
Three engineers in the Laboratory of Electronics and Microprocessors at Rockefeller University, New York, New York (Gordon Silverman, Avram Stundel, and John Lehman) designed a multiprocessor-based instrument in 1982, using a set of microprocessor-controlled building blocks with a shared bus and common memory to handle their interaction [SILV82]. The main features are:
1. Each block is a Single Board Computer (SBC) built from an Intel 8085A 8-bit CPU and other system components.
2. One of the processors was configured as the executive.
3. The executive supervised a dual-port RAM which served as the common memory for global
38. hen A1 or A2 or both are 1, which means that the MC68000 needs to access one of the TMS9650 registers other than the Data or Data Inc register. In this case, the signal passes through the third and fourth D latches. The output of the fourth D latch is used as the DTACK to the MC68000, which is necessary for asynchronous I/O operation of the MC68000. The AND gates used to combine the DTACKs for output to the MC68000 should have open-collector outputs, since this signal is going to be ANDed with the other DTACKs from the other peripheral devices in the system. That is why the IC chip SN74LS09, instead of the SN74LS08, is used in the design.
4. Activation of the second Quad D latch: When both A1 and A2 are driven to zero, the MC68000 needs to access the Data or Data Inc register of the TMS9650. The output Q of the second D latch of the first Quad D latch is connected to the clear input of the second Quad D latch to clear and activate it.
5. Generation of the DTACK from the READY Signal: The output signal READY from the TMS9650 is connected to the input of the first D latch of the second Quad D latch. The four D latches are cascaded, and the output Q of the first D latch is ORed with the output Q of the fourth D latch to generate the DTACK to the MC68000. The truth table for this generation is shown in Table 9, and the logic equation is as follows:
DTACK = Q1 + Q4
6. The Bus Error Requirement: READY i
39. ic computations, microprocessors with 16 bits or more of data bus and high computation capability must be used [KART82, TOUN81]. They usually use common memory to manage data flow between processors, which results in bus and memory access conflicts, and considerable software overhead is required to obtain system synchronization.
1.4 Preview
Bus and memory conflicts are the major problems in many current multiple microprocessor systems, such as process control systems, which use a common memory for message exchange. In the following chapters we explain how a system structure built around MC68000s as the control units, using dual-port memories (which have no bus conflict and minimal memory conflict) as the data exchange media, and having a versatile configuration ability, will help to solve these problems. A general overview of multiple processor systems and their advantages, history, and problems is contained in this chapter. Chapter 2 describes different configurations and the expansion capability of the system developed for this research. The remainder of this thesis is devoted to details of this system. The functions of the control unit and the interprocessor communication medium are presented in Chapter 3. In Chapter 4, the hardware details of the interprocessor communication unit, the dual-port memory, are addressed. In Chapter 5, we briefly describe an actual process con
40. is possible, except that the value of N is limited by the memory space addressable by the address bus of the control unit. For the system developed for this research and described in the remainder of this thesis, we use a Motorola MEX68KECB as our single board computer. On the MEX68KECB there are seven connecting points from the address decoder which can be used to enable user expanded memory units. Thus, eight SBCs are allowed in our structure, and a maximum of twenty-eight DPMs could be included to perform message transfer between any two processors.
Figure 5: The Star Configuration
Figure 6: Square Configuration
2.3 The System Configuration Developed for this Research
The structure designed in this research does not have the complexity that a general purpose multiple processor system with common memory usually has. For example, our system includes neither control nor address buffers for each processor, and no external arbitration logic for these buffers. It does provide the capability of being configured into different special purpose multiple processor systems. A system for a particular application can easily be realized from this structure by plugging in the proper number of SBCs and DPMs with appropriate interconnections [STAR80, MOTO82].
Three MC68000-based SBCs (the MEX68KECBs from Motorola Incorporated) and three TMS9650-based DPMs f
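The counts quoted above follow from simple pair counting: if every pair of SBCs is joined by its own DPM, N SBCs need at most N(N-1)/2 DPMs, and an SBC with seven DPM connecting points can have at most seven neighbors. A short Python check, with hypothetical function names chosen only for illustration:

```python
def max_dpms(num_sbcs):
    """Maximum number of DPMs when every pair of SBCs shares one DPM."""
    return num_sbcs * (num_sbcs - 1) // 2


def max_sbcs(dpm_points_per_sbc):
    """Largest fully connected system: each SBC links to every other SBC."""
    return dpm_points_per_sbc + 1


if __name__ == "__main__":
    n = max_sbcs(7)            # seven connecting points on the MEX68KECB
    print(n, max_dpms(n))      # eight SBCs, twenty-eight DPMs
```

The same formula gives 3 DPMs for the three-SBC Delta and 6 for a fully connected four-SBC structure.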
41. latively less local processing compared with accesses. For large numbers of local processing loops, the increased throughput ratios are around 2.9 (see Table 7), which approach the theoretical maximum of 3.0 for a multiple microprocessor system with three CPUs.
Memory conflicts are measured using 10000 outer loops and one local processing loop with different numbers of lock loops (Table 8 and Figure 19). When the number of lock loops increases from 100 to 500, the execution times are clustered around 2.44 seconds, which is the minimum execution time for executing the test program without lockout. When the number of lock loops increases to 1000, the execution time of the test program increases to about 3.20 seconds. The difference of 0.76 seconds seems to result from memory conflicts.
6.3 Future Research
A system with more SBCs and more DPMs may be developed to provide more processing power and more capability of reconfiguration. Software may be included in each local memory to implement dynamic reconfiguration. We may find or build a different SBC which will allow more than the present seven DPMs to be connected to it.
The memory capacity of the TMS9650 chip used in this design is limited to two hundred and fifty-six bytes. If the capacity of memory in a DPM is increased to allow more messages to be transferred between two processors, the range of applications for this type of structure will be increased. Therefore we need to
42. nted in the next chapter.
CHAPTER V
TESTING THE SYSTEM
In this chapter we describe the algorithm and test procedure used to validate our system as an example of a real-time process system employed in a chemical plant. At the end of the chapter we interpret the results.
5.1 The Test Procedure
Since the throughput of a system is an inverse function of that system's execution time, we used elapsed job time to determine the increase, if any, in throughput of our system over that of a single SBC system. The test procedure consisted of running nearly identical jobs in each SBC, each of which executed a predefined number of loops. After each completion of the inner loop, the SBC notified its successor, the SBC to its left in the Delta configuration. Thus, in effect, each SBC was able to maintain a count of the number of notifications sent by its predecessor in the Delta. The algorithm used in the test is shown below. Following the algorithm, we discuss and display the test results.
5.2 The Test Algorithm
The sequence of procedures which the test programs follow is:
0. Initialization: There are three SBCs (SBC1, SBC2, and SBC3) in the Delta test system, as shown in Figure 10. The system is started in three steps:
a. SBC3 is started first to run the test programs. It initializes the semaphores in its DPM2 and DPM4 to zero and then enters a waiting loop, in which SBC3 keeps on p
43. olling a location in DPM4 to see if a starting message has arrived from SBC2. If yes, SBC3 goes to Procedure 1.
b. SBC2 is then started. It initializes the semaphore in its DPM to zero and then enters a waiting loop to see if a starting message has come from SBC1. If yes, it sends a starting message to a specific location in its DPM2 to start SBC3 and then goes to Procedure 1.
c. SBC1 is started last. It sends a starting message to a specific location in its DPM2 to start SBC2 and then goes to Procedure 1.
1. Each SBC executes a predefined number M of outer loops, which repeat Procedure 2 to Procedure 5.
Figure 10: Message Transfer in the Developed Delta System (each DPM is known as DPM2 to one of its two SBCs and as DPM4 to the other)
2. Each SBC executes a predefined number N of local processing loops.
3. The TAS instruction is executed before an SBC accesses a DPM to ensure two things: operation synchronization and message validation.
a. Sending Notification
i. Each processor checks the semaphore bit, which is bit 7 in the first byte of DPM2, to see if DPM2 is available. If yes, it then goes ahead to access DPM2. Otherwise, the processor enters a waiting loop.
ii. Ea
44. ormance Evaluation: Tools and Techniques for Effective Analysis. 1982. Van Nostrand Reinhold Company. Pages 714 133.
[MOTO81] Motorola Microprocessors Data Manual. 1981. Motorola, Inc.
[MOTO82] MC68000 Educational Computer Board User's Manual. January 1982. Motorola, Inc.
[MOTO83A] MC68000 16-bit Microprocessor. 1983. Motorola, Inc.
[MOTO83B] M68000 16/32-bit Microprocessor Programmer's Reference Manual. 1983, fourth edition. Motorola, Inc.
[PATS81] Patstone, Walt. 16-bit Microprocessor Benchmarks: An Update with 5. September 1981. Pages 169-182.
[RAO 82] Rao, Guthikonda V. Microprocessor and Microcomputer Systems. 1982. Van Nostrand Reinhold Company.
[RUHB81] Ruhberg, David L., and Michael. Multi-processor Controller Using the MC6809E and the MC68120. 1981. Microprocessor Application Engineering.
[RUSS77] Russo, Paul. Interprocessor Communication for Multi-microcomputer Systems. April 1977. IEEE Transactions on Computers, pages 67-76.
[SANG85] Sanguinetti, John, and B. Kumar. Performance of Message-Based Multiprocessors. May 1985. IEEE Conference on Supercomputers, pages 424-425.
[SATY80] Satyanarayanan, M. Commercial Multiprocessing Systems. 1980. IEEE Computer, pages 75-96.
[SIGN76] Signetics. Data Manual: Logic, Memories, Interfac
45. ory
3.3 The Test And Set Instruction
3.4 Preview
IV. INTERPROCESSOR COMMUNICATION HARDWARE
4.1 The Design of the Interface Logic
4.2 Connection of the TMS9650 and the MC68000
4.3 Preview
V. TESTING THE SYSTEM
5.1 The Test Procedure
5.2 The Test Algorithm
5.3 The Test Results
5.4 Preview
VI. SYSTEM EVALUATION AND SUGGESTIONS FOR FUTURE RESEARCH
6.1 Summary
6.2 Evaluation
6.3 Future Research
BIBLIOGRAPHY
APPENDIX
A. THE HARDWARE DETAIL OF THE INTERFACE BOARD
B. THE BENCHMARK PROGRAM
LIST OF TABLES
1. The Addresses of the TMS9650 Registers in the MC68000 Memory Map
2. Execution Times of Three SBCs in the Delta Configuration (Five Outer Loops)
3. Execution Times of Three SBCs in the Delta Configuration (Ten Outer Loops)
4. Execution Times of a Single Processor (Five Outer Loops and Three Main Loops)
5. Execution Times of a Single Processor (Ten Outer Loops and Three Main Loops)
6. Ratios of Execution Times of a Single SBC with a Delta Configuration (Five Outer Loops)
7. Ratios of Execution Times of a Single SBC with a Delta Configuration (Ten Outer Loops)
8. Execution Times and Lock Loops
9. Truth Table to Generate the DTACK
46. re is no bus activity in this step.
3. It issues a second bus cycle to write the data back to the same address in the DPM.
This operation is indivisible in that the AS is asserted throughout the three steps. To allow synchronization of multiple processors, the TAS instruction is used to lock out a shared memory, such as the DPM, from other processors when one processor has control of it. When an MC68000 addresses an operand using a TAS instruction, the memory in which this operand is located is not available to any other processor until the instruction is completed. The TAS has the symbolic form
TAS EA
where EA is the effective address of the operand, and its definition is:
if (EA) = 0, where (EA) is the contents of EA, then set Z = 1, else set Z = 0;
if bit 7 of (EA) = 1, then set N = 1, else set N = 0;
set bit 7 of (EA) to 1.
Z and N are the Zero and Negative bits in the CCR (Condition Code Register), respectively. The TAS instruction is a byte operation; its operand is the byte pointed to by EA.
The hardware implementation of this specific instruction in a processing unit provides the synchronization of message transfers which is necessary in a multiple processor environment.
3.4 Preview
In this system, we have used the MC68000 as the control unit and the TMS9650 as the dual port memory to implement a Delta structure. The TAS instruction
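The definition of TAS given above can be restated as a short Python function that models only the arithmetic effect on the operand byte and on the Z and N flags. It is a sketch: the indivisible read-modify-write bus cycle cannot be expressed in single-threaded Python, and the names `tas` and `try_lock` are illustrative, not part of any Motorola tool.

```python
def tas(memory, ea):
    """Model TAS on the byte at memory[ea]; return the resulting (Z, N) flags."""
    old = memory[ea]                 # step 1: read the operand byte
    z = 1 if old == 0 else 0         # Z reflects the value before modification
    n = 1 if old & 0x80 else 0       # N reflects bit 7 before modification
    memory[ea] = old | 0x80          # steps 2 and 3: set bit 7 and write back
    return z, n


def try_lock(memory, ea):
    """Use TAS as a lock: success means bit 7 was clear before this call."""
    _, n = tas(memory, ea)
    return n == 0
```

A processor that gets `False` from `try_lock` would repeat the attempt in a waiting loop; the owner releases the lock by clearing bit 7 with an ordinary write.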
47. re.
Figure 20: Generation of OE and WE
Figure 21: Diagram of the Interface Circuit
APPENDIX B
THE BENCHMARK PROGRAM
The following code is the benchmark to evaluate the system.
1. MULTIPLE PROCESSORS
a. PROCESSOR1 (columns: ADDRESS, ASSEMBLY LANGUAGE)
ADDRESS 001000 00101C 001028 MOVEeL 410 04 MOVEeL 40 05 MOVE 3 1 2 5020008 415020009 MOVceL 12000 02 0 03 ADDeL 41 03 03 02 8GT L 001028 858 5 001064 BSReS 00109A ADDeL 41 05 05 0 BGTeS 00101 MOVE 3 0 50200098 MOVced 15020009 80 001062 001064 001088 00109 0010 MOVE 3 MOVE eL BRAeS MOVE 3 5 8 5 5 8 8 RTS 5 5 5 5 5 MOVES RTS 040008 09 040009 001062 09 02000B 020009 001064 001088 20 020 08 41 5020009 0 020008 41 020009 001064 409 50409208 5040009 500109 02108E 40 0400CB 405040009 500109 0 5040008 70 5040009 81
b. PROCESSOR2
001000 001056 001062
48. resses of the programs are assigned to each program counter. TUTOR runs the test program when the GO command is entered through the terminal. For real-world applications, a more sophisticated operating system is needed to provide additional functions.
Figure 7: The System Topology
3.2 The Dual Port Memory
A dual port memory is used as an interprocessor communication medium between each pair of processors in this multiple microprocessor system. This type of structure introduces no bus conflict because the dual port memory prevents the system bus of an individual processor from interacting with any other. Each processor may gain control of its bus whenever a bus cycle is needed and access the common dual port RAM in the same manner as it accesses its local RAM. Memory conflict is reduced to a minimum because only two processors contend for each DPM, and contention is usually resolved in the logic of the dual port memory. The TMS9650 IC chip consists of one 256-byte static RAM, eight 8-bit registers, control pins, and arbitration logic. These features are briefly described in the following [TEXA84B]:
1. The Registers: In the following register descriptions, local refers to the port connected to the SBC which is accessing the DPM, and remote refers to the port connected to the SBC on the other side of the DPM.
a. Data Register: This acts as a d
49. rom Texas Instruments Incorporated were included in this implementation. These modules (SBCs and DPMs) were connected into a delta or triangle type system when it was tested and evaluated. The topology of this system is shown and discussed in the next chapter. However, these modules could be connected in different ways, or modules might be added to or removed from this system, when application requirements change. In applications which require frequent and quick reconfigurations, the system developed is an effective and economical one because of its modular structure.
2.4 Preview
In the next chapter, the MC68000 control unit and the TMS9650 dual port memory unit are discussed first. The Test And Set instruction, which is used in this multiple processor system implementation, is then introduced.
CHAPTER III
SYSTEM COMPOSITION
The functions of the control unit (the MC68000) and the configuration of the dual port memory (the TMS9650) used in the system design are covered briefly in the beginning of this chapter. A diagram of the topology of the system and an explanation of the TAS instruction, used for system synchronization in the multiple processing environment of this implementation, also appear in this chapter.
3.1 The Control Unit
The MC68000 [MOTO83A] is used as the control unit because it provides a longer word length, more addressing modes, an extensive and powerful instruction set, higher oper
50. s used to synchronize the TMS9650 and the MC68000 when a message transfer through the RAM in the TMS9650 is needed. If the RAM in the TMS9650 is accessed by the MC68000 from the remote side of the TMS9650, the MC68000 on the local side of the TMS9650 is put into a wait state through the READY signal. The bus cycle won't cause any Bus Error until a period of 40ms has elapsed. The period of 10 ms is about 40 clocks and is long enough for the MC68000 to perform a regular memory access and release the dual port memory lock.
The diagram of the whole circuit is shown in Figure 21.
TABLE 9: Truth Table to Generate the DTACK
Note: Don't Ca
51. should be provided to allow additional processing power to be added without the need for the total redesign of existing systems. We must investigate changes to classical system architecture so that system upgrading in the near future is possible [ENSL74].
In many control applications of computer systems, reliability and availability are very important features, in that a failure can result in damage to equipment and to the material processed and can pose a threat to the safety of people working in the environment.
Therefore, computer systems which have high performance, expansion capability, and reliability are needed [JOHN84, BALP81, TEDD84, LEIB85]. A multiple processor microcomputer architecture is a major candidate to provide the increased performance, reliability, and availability required. In this chapter, the advantages of multiple processor systems which match the demands of today's computing environment are described. Several different designs and implementations of multiple processor systems developed in the recent past, and their problems, are then discussed.
1.1 Advantages of Multiple Microprocessor Systems
There are many advantages of using multiple microprocessor systems. The following lists some of these:
1. High performance/cost ratio: The performance of a multiple processor system is higher than that of a single processor system using the same processing unit because of the additional availability of
52. sors connected to it. These sensors are used to collect process parameters such as temperature, pressure, air density, and material volume in a chemical reaction tank. SBC1 passes process results to SBC2 through DPM1. SBC2 receives the results and compares them with previous data stored in its local database. SBC2 then makes a proper decision and passes this decision to SBC3 through DPM2.
Figure 11: Flow Chart for Processor1 (boxes include: set the number of outer loops to M; initialize the outer loops counter C1 to zero; set the number of local processing loops to N; initialize the local processing loops counter C2 to zero; increment C2 by 1; unlock; break point)
Figure 12: Flow Chart for Processor2 (boxes include: set the number of outer loops to M; initialize the outer loops counter C1 to zero; set the number of local processing loops to N; initialize the local processing loops counter C2 to zero; increment C2 by 1; DPM2 subroutine; DPM4 subroutine; increment C1 by 1; break point)
Figure 13
53. tions between the MC68000s. When contention is resolved and access is given to one of the MC68000s, the READY signal is driven high to the corresponding side of that MC68000 so that the bus cycle may proceed.
4. The control lines:
a. RESETs from the MC68000s are connected to the Mode pins on each side of a TMS9650 to select a proper operation mode of the TMS9650 DPM.
b. AS (Address Strobe, Low) from an MC68000 is connected to LOCKIN through the interface board.
c. Es (Enables) from the address decoder, LDS (Lower Data Strobe, Low), and UDS (Upper Data Strobe, Low) are connected to the CSs (Chip Selects) of the TMS9650 after passing through the interface board.
d. The output signal line READY from the TMS9650 enters the interface circuit to generate the DTACK (Data Transfer Acknowledge, Low) signal.
e. R/W (Read, or Write Low) from the MC68000 is used to generate OE (Output Enable, Low) and WE (Write Enable, Low) to the TMS9650.
3.3 The Test And Set Instruction
The MC68000 assembly language includes the TAS (Test And Set) instruction, which uses a read-modify-write cycle to provide meaningful communication between processors in a multiple processor system. The MC68000 executes this instruction in three steps:
1. It issues a bus cycle to read data from a specific location in the DPM into a register in the MC68000.
2. It modifies the data in the register. The
54. trol example and test programs which simulate a process control system. Chapter 6 summarizes this thesis project and evaluates the system designed, using tables and figures obtained from execution of the test programs. Suggestions for future extensions are discussed at the end of the concluding chapter.
CHAPTER II
SYSTEM CONFIGURATIONS
In this chapter, system structures which consist of Single Board Computers (SBCs) and Dual Port Memories (DPMs), and which provide true reliability, ease of design, and reconfiguring ability, are discussed. These system structures are not restricted to any particular application. The modules in this structure can be easily rearranged to form new configurations for new applications.
2.1 System Reconfiguration
Depending on the situations encountered, there are two possible ways of reconfiguring systems containing SBCs and DPMs: dynamic and static. Dynamic reconfiguration is done during run time using software stored in the local memory of each SBC. This software reflects the number of SBC and DPM modules in the system and the connecting paths between these modules. Static reconfiguration, on the other hand, is performed by physically connecting or disconnecting some SBCs and DPMs in the system before the system is turned on. In this section we describe possible configurations of three SBCs. Configurations using more than three SBCs and DPMs are discussed
55. uctures are redundant in nature, and reliability is improved by their duplicated hardware elements and software tasks. A failure in one processor is usually not fatal; the system can keep on operating with some degree of degradation. Thus, multiple processor systems are more reliable than uniprocessor systems.
High availability: Most multiple microprocessor systems have high availability because of their longer mean time before failure and shorter time to repair compared to single processor systems. These features are due to redundancy and to the fact that in most multiple microprocessor systems the processing units are similar or even identical to each other, which means a faulty unit can easily be replaced by a spare. In single processor systems, the failure of a single IC chip could result in the breakdown of the whole system. Besides, it is easier to locate a faulty module in a multiple processor system than to find a faulty component in a single processor system. A single processor system is quite different in that the whole system must be turned off if a faulty processor is to be replaced by a new one.
Concurrent system development and utilization: A multiple processor system has a greater potential for use even when it is only partially built. Moreover, the development, implementation, and installation phases can be overlapped. Once a functionally independent unit is developed, it can
56. uter Loops
57. ven previously are symmetric graph structures such that any one of the SBCs can communicate with all other SBCs in the system, and any of them can be assigned as a master. Figure 3 shows the Pyramid configuration.
2. Delta Configuration: The Pyramid can be reconfigured into a Delta Structure by disconnecting at the points indicated by asterisks in Figure 4.
3. Square Configuration: The Pyramid can be reconfigured into a Square Structure by disconnecting at the points indicated by asterisks in Figure 5. Note: the Delta and the Square Configurations are examples of Ring Structures.
Figure 3: The Pyramid Configuration
Figure 4: The Delta Configuration
4. Star Configuration: The Pyramid structure degenerates to a Star structure if the three outer dual port memories are actually taken away or are ignored by the software in real time. The Star configuration is a master-slave system and is useful in redundant processing, with the central SBC acting as a checking processor [ENSL74, RAO 82, VIVE82]. The Star Structure, reconfigured from the Pyramid Structure by disconnecting at the points indicated by asterisks, is shown in Figure 6.
5. Pipeline Configuration: We can also reconfigure the Pyramid into a Pipeline of three or four SBCs and two or three DPMs by making appropriate disconnections in the Pyramid.
The expansion of this structure to include N SBCs with a maximum of (1/2)N(N-1) DPMs
58. y in Figures 17 and 18. The elapsed times were measured with a hand-held stopwatch. Five runs were made for each value of N, the number of inner loops. The averages of the elapsed times as a function of N are plotted in Figures 17 and 18. As the figures clearly show, the elapsed times are nearly perfect linear functions of N.
We calculated, tabulated, and plotted the ratios of elapsed times for our system and for the single SBC system. The ratios improved as N increased and seem to be asymptotic to the value 3, which is the theoretical maximum increase in throughput for a multiple processor system of three SBCs.
Execution times of the test programs with ten thousand outer loops and only one local processing loop were then measured as a function of lock loops, which is the number of times an SBC will try to access a dual port memory before it gives up. Then the semaphore bit in the DPM is set to zero.
TABLE 2: Execution Times of Three SBCs in the Delta Configuration (Five Outer Loops)
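The ratio calculation described above (average the runs for each system, then divide the single-SBC time by the Delta time) can be sketched as follows. The function name is hypothetical, and the run times in the usage comment are made-up placeholder values, not measurements from the thesis tables.

```python
def throughput_ratio(single_sbc_times, delta_times):
    """Ratio of average elapsed times: single SBC system over Delta system.

    Each argument is a list of elapsed times in seconds for repeated runs
    (five runs per setting in the procedure described in the text).
    """
    if not single_sbc_times or not delta_times:
        raise ValueError("each system needs at least one measured run")
    avg_single = sum(single_sbc_times) / len(single_sbc_times)
    avg_delta = sum(delta_times) / len(delta_times)
    return avg_single / avg_delta


# Placeholder example: a single SBC averaging 6.0 s against a Delta
# averaging 2.0 s gives the theoretical maximum ratio for three SBCs.
# throughput_ratio([6.0] * 5, [2.0] * 5)
```

Since throughput is taken as the inverse of execution time, a ratio close to 3 means the three-SBC Delta is doing nearly three times the work per unit time.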
