Sun Fire™ 6800/4810/4800/3800 Systems Overview

Contents

1. [Figure: Boards 0 through 5 grouped into domains (Domain 0, Domain 1) within Partition A and Partition B. Legend: active domain connection, inactive logical connection. Figure note: a partition can have up to two domains.]
FIGURE 2-1 Partitions and Domains on a Sun Fire 6800 System
Partitions
A single physical Sun Fire 6800 system can be divided into two partitions. All connections between boards of one partition and boards of the other partition are disabled. The system logically behaves as two separate systems. If the partitions are assigned to the physical half of the Sun Fire 6800 system, then the power planes associated with each partition are also isolated.
A Sun Fire 6800 system can be divided into two partitions by logically isolating one set of Repeater boards for each partition. The Sun Fire 4810/4800/3800 systems also support two partitions.
(Sun Fire 6800/4810/4800/3800 Systems Overview, April 2001)
Each partition on the Sun Fire 6800 system can have up to two domains, allowing for up to four domains total. For the Sun Fire 4810
2. FIGURE 1-2 Sun Fire 6800 System Cabinet, Front and Rear Views
Sun Fire 4810 System
The Sun Fire 4810 system has support for three CPU/Memory boards, two I/O assemblies, two Repeater boards, and two System Controller boards. FIGURE 1-3 shows front and rear views of the Sun Fire 4810 system mounted in an optional Sun Fire cabinet. TABLE 1-3 lists the features of the Sun Fire 4810 system.
TABLE 1-3 Sun Fire 4810 System Features
Features             Quantity or Description
CPU/Memory boards    3
CPUs                 12
Maximum memory       96 Gbytes
I/O asse
3. Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees who implement OPEN LOOK GUIs and otherwise comply with Sun's written license agreements.
Federal Acquisitions: Commercial Software. Government Users Subject to Standard License Terms and Conditions.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Copyright 2001 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, CA 94303, United States. All rights reserved. This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including the technology relating to
4. There are several capabilities that enable service to be performed without forcing scheduled downtime. Failing components are identified in the failure logs in such a way that the field-replaceable unit is clearly identified. All boards and power supplies in a properly configured system can be removed and replaced during system operation without scheduled downtime.
Mechanical Serviceability
Connectors are keyed so that boards cannot be installed upside down. Special tools are not required to access the inside of the system, because all voltages within the cabinet are considered extra-low voltages (ELVs) as defined by applicable safety agencies. No jumpers are required to configure the Sun Fire system, which makes installing new or upgraded system components much easier. There are no slot dependencies other than the special slots required for the System Controller and Repeater boards. The Sun Fire cooling system design includes capabilities that strengthen RAS. Standard, proven parts and components are used wherever possible. Field-replaceable units (FRUs) and subassemblies are designed for quick and easy replacement with minimal use of tools.
Bulk Power Supply Removal and Replacement
Bulk 56-volt power supplies can be hot-swapped with no interruption to the system. This assumes that the system is configured from the factory for power supply redundancy.
5. Fire system contains a number of subsystems that are capable of recovering from errors without failing. Subsystems that have a large number of connections have greater odds of failure. The subsystems that have the highest probability of errors are protected from transient errors through the use of single-bit error correction that uses an error-correcting code.
Error-Correcting Code Protection of the Data Interconnect
The entire data path from the local data crossbars and the memory subsystem is protected by an error-correcting code. Single-bit data errors detected in these subsystems are corrected by the receiving UltraSPARC III module, and the system is notified, for logging purposes, that an error has occurred. The memory subsystem does not check or correct errors, but provides the extra storage bits. The Sun Fire data buffer chips use the error-correcting codes to assist in fault isolation.
If a correctable error is detected by the interconnect, the system controller is notified and enough information is saved to isolate the failure to a single net within the interconnect system. The data containing the error is sent through the interconnect unchanged, and the error is reported.
Memory errors are logged by software so that defective DIMMs can be identified and replaced during scheduled maintenance.
Detecting Uncorrectable Errors
Almost all internal system paths are protected by
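To make the single-bit correction described in this excerpt concrete, the following is a minimal Python sketch built on a textbook Hamming(7,4) code. This is an assumption chosen for brevity: the excerpt does not say which error-correcting code the Sun Fire hardware uses, and the function names and the `log` list are hypothetical. What the sketch does mirror is the division of labor in the text: the receiver corrects the bit and records the event for logging.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword, log=None):
    """Correct at most one flipped bit.

    Returns (corrected_codeword, position), where position is the 1-based
    index of the corrected bit, or 0 if no error was detected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome equals the failing position
    if position:
        c[position - 1] ^= 1  # the receiver corrects the bit...
        if log is not None:
            log.append(f"corrected single-bit error at position {position}")  # ...and reports it
    return c, position


if __name__ == "__main__":
    events = []
    sent = hamming74_encode([1, 0, 1, 1])
    received = list(sent)
    received[4] ^= 1  # simulate a transient single-bit error in transit
    fixed, where = hamming74_correct(received, events)
    print(fixed == sent, where, events)
```

A code of this kind corrects only one flipped bit per codeword; multiple-bit errors need separate detection, which is the subject the excerpt turns to next.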
6. System Controller board (SC0) fails, the redundant System Controller board (SC1) takes over operations without causing a disruption of the system operation.
Virtual Domain Key Switches
The system controller provides a virtual key switch for each domain. The key switch command controls the position of the virtual key switch for each domain.
Solaris Console
The system controller provides a Solaris software console for each domain. The Ethernet or serial port of the System Controller board is the physical connection for the Solaris software console. The serial port can support only one console at a time. However, the Ethernet port can support many consoles simultaneously. The system controller multiplexes these physical connections to provide console services for each domain and for the system controller.
Virtual Time Of Day
The Sun Fire TOD/NVRAM chip is located on the System Controller board. The system controller multiplexes the physical TOD chip to provide TOD services for each domain and for the System Controller board. The system controller also provides for synchronizing the TOD between the main System Controller board and the redundant System Controller board.
Environmental Monitoring
The Sun Fire system has a large number of sensors that monitor temperature, voltage, and current. The system controller polls these devices periodically. If thresholds are exceeded, the s
7. for CompactPCI supports a total of six slots.
Repeater Board
The Sun Fire 6800/4810/4800 systems are designed to be repaired and upgraded more easily and quickly than previous systems. This is due to the placement of the active ASICs on the Repeater boards. With two Repeater boards installed in the system, an alternate path is available through the second board if one board fails. The Sun Fire 3800 system is the only system that comes with all of the active components of the Repeater board built onto the centerplane.
The Repeater boards provide two functions: redundancy for reliability, and higher bandwidth. The system can operate with only one Repeater board. The Repeater board acts as a switch and connects multiple CPU/Memory and I/O boards together. Its three components are the Address Repeater (AR), the Sun Fire data controller (SDC), and the data crossbar (DX).
In standard operation, the Sun Fire 6800 system has four Repeater boards, which are used to route ten buses (six CPU and four I/O). If one of the Repeater boards fails, the system can continue to operate in a degraded mode with one pair of adjacent Repeater boards. The data width is cut in half, and the two Repeater boards route the ten buses.
Because the Sun Fire 4810 and 4800 systems support only two Repeater boards, the two Repeater boards operate together to route five buses (three CPU and two I/O). If one of the Repeater boards fails, the data width is cut in half and one
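The Repeater board failure behavior in this excerpt can be summarized as a small lookup. The Python sketch below is illustrative only: the dictionary, function name, and return fields are hypothetical, and it covers just the single-failure case the text describes (the end of the 4810/4800 sentence is completed from excerpt 17, which says one Repeater board can route the five buses).

```python
# Board and bus counts as stated in the excerpt; the structure itself is an assumption.
REPEATER_CONFIG = {
    "6800": {"boards": 4, "buses": 10},  # six CPU buses + four I/O buses
    "4810": {"boards": 2, "buses": 5},   # three CPU buses + two I/O buses
    "4800": {"boards": 2, "buses": 5},
}


def routing_state(model, failed_boards=0):
    """Return active Repeater boards, buses routed, and relative data width."""
    cfg = REPEATER_CONFIG[model]
    if failed_boards == 0:
        return {"active_boards": cfg["boards"], "buses": cfg["buses"], "data_width": 1.0}
    if failed_boards == 1:
        # Degraded mode: half of the boards keep routing every bus at half the data width.
        return {"active_boards": cfg["boards"] // 2, "buses": cfg["buses"], "data_width": 0.5}
    raise ValueError("the excerpt describes only the single-failure case")


if __name__ == "__main__":
    print(routing_state("6800", failed_boards=1))
    print(routing_state("4800", failed_boards=1))
```

The Sun Fire 3800 is left out on purpose, since its repeater components are built onto the centerplane rather than placed on separate boards.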
8. Fan Tray Removal and Replacement
If a fan fails, the remaining working fans are set to high-speed operation by the system controller in order to compensate for the reduced airflow. The system is designed to operate normally under these conditions until the failed fan assembly is serviced. The fan trays can be hot-swapped with no interruption to the system.
Domain Isolation
The Sun Fire system has an interconnect domain facility that enables the system boards to be assigned to separate software systems. For example, one domain can do production while a second domain experimentally runs the next revision of the operating system or exercises a suspected bad board with production-type work.
Nonconcurrent Service
Nonconcurrent service requires the entire system to be powered off.
Remote Service
Every System Controller board has remote access capability that enables remote login to the system controller. Through this remote connection, all system controller diagnostics are accessible. You can run diagnostics remotely or locally on deconfigured system boards while the operating system is running on the other system boards.
CHAPTER 3 Hardware Overview
The Sun Fire systems are a family of symmetrical shared-memory multiprocessors (SMPs). You can view the Sun Fire systems at severa
9. 0 systems share many of the same components. These components are the CPU/Memory boards, the I/O assemblies, the Repeater boards, and the System Controller boards. The Sun Fire 3800 system also shares the CPU/Memory boards, but uses different System Controller boards and I/O assemblies.
CPU/Memory Board
The CPU/Memory board is the same across the Sun Fire 6800/4810/4800/3800 systems. This board supports up to four UltraSPARC III CPU modules, eight Ecache SIMMs, and eight banks of memory (two banks per CPU), with four DIMM sockets per bank, for a total of 32 DIMMs. All DIMMs within a bank must be the same capacity and size, and DIMMs must not be intermixed on a board.
All CPUs in the system must be the same clock speed for optimum performance. Mixed speeds are supported, but not at the board level: all CPUs on a board must be the same speed. The system operates all CPUs at the lowest CPU speed in the system.
I/O Assemblies
The Sun Fire 6800/4810/4800 systems support only PCI I/O devices. The Sun Fire 3800 system supports only CompactPCI I/O devices.
PCI I/O
The I/O assemblies are logically and physically the same for the Sun Fire 6800/4810/4800 systems. The basic PCI I/O assembly has six slots for standard PCI (33 MHz) device boards plus two slots for PCI 66 (66 MHz) device boards.
CompactPCI I/O
The CompactPCI I/O assembly is designed for CompactPCI form factor device boards. The Sun Fire 3800 system I/O assembly
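The CPU/Memory board figures in this excerpt follow from simple multiplication, and the clock-speed rule reduces to "uniform per board, minimum across the system." The Python sketch below works through both. The function names, board labels, and MHz values are hypothetical and used only for illustration; only the per-board counts and the speed rule come from the text.

```python
CPUS_PER_BOARD = 4   # up to four UltraSPARC III CPU modules per board
BANKS_PER_CPU = 2    # two memory banks per CPU
DIMMS_PER_BANK = 4   # four DIMM sockets per bank


def board_totals(boards):
    """Maximum CPU, memory bank, and DIMM counts for a number of CPU/Memory boards."""
    cpus = boards * CPUS_PER_BOARD
    banks = cpus * BANKS_PER_CPU
    return {"cpus": cpus, "banks": banks, "dimms": banks * DIMMS_PER_BANK}


def system_cpu_speed(speeds_by_board):
    """Return the speed all CPUs run at: the lowest CPU speed in the system.

    Raises ValueError if CPU speeds are mixed on a single board, which the
    excerpt says is not supported.
    """
    for board, speeds in speeds_by_board.items():
        if len(set(speeds)) > 1:
            raise ValueError(f"{board}: all CPUs on a board must be the same speed")
    return min(speed for speeds in speeds_by_board.values() for speed in speeds)


if __name__ == "__main__":
    print(board_totals(1))  # one board: 4 CPUs, 8 banks, 32 DIMMs
    print(board_totals(3))  # three boards: 12 CPUs, matching the Sun Fire 4810 table
    # Illustrative speeds in MHz, not taken from the excerpt:
    print(system_cpu_speed({"board0": [900] * 4, "board1": [750] * 4}))
```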
10. CHAPTER 2 System Features and Capabilities
The Sun Fire family of servers has features and capabilities that were not available on previous midrange systems. Primary features include the ability to partition your system and create domains. These features provide greater reliability, availability, and serviceability, which means greater uptime. These features and capabilities are as follows:
- Partition: The ability for the system to logically behave as two separate systems.
- Domain: The ability to create multiple logically independent sections within a partition, with each domain running its own operating system.
- Reliability: A function of the care with which the hardware and software design was executed, the quality of the components selected, and the quality of the manufacturing process (for example, ESD protection, clean rooms, and so forth).
- Availability: The percentage of time the customer's system is able to do productive work.
- Serviceability: The system ensures that repair time (downtime) is kept to a minimum.
Partitions and Domains
The Sun Fire system can be divided into partitions and domains. By using partitions and domains, a single physical system can have multiple independent logical systems, each running its own operating system. Partitions and domains differ only in terms of their flexibility and isolation. Partitions A and B have separate Repeater boards
11. Preface
This document provides the following information about the Sun Fire system family of servers:
- Machine configurations of the four servers
- Hardware overview
- System components
- Reliability, availability, and serviceability features
Typographic Conventions
Typeface or Symbol: AaBbCc123. Meaning: The names of commands, files, and directories; on-screen computer output. Examples: Edit your .login file. Use ls -a to list all files.
Typeface or Symbol: AaBbCc123. Meaning: What you type, when contrasted with on-screen computer output. Examples: % su Password:
Typeface or Symbol: AaBbCc123. Meaning: Book titles, new words or terms, words to be emphasized; command-line variable, replace with a real name or value. Examples: Read Chapter 6 in the User's Guide. These are called class options. You must be root to do this. To delete a file, type rm filename.
Related Documentation
Application, Title, Part Number
Installation:
  Sun Fire 6800 System Installation Guide, 805-7375
  Sun Fire 4810/4800/3800 Systems Installation Guide, 805-7370
  Sun Fire 4810/4800/3800 Systems Cabinet Mounting Guide, 806-6781
Operation:
  Sun Fire Cabinet Installation and Reference Guide, 806-2942
  Sun Fire 6800 System Getting Started, 805-7374
  Sun Fire 4810/4800/3800 Systems Getting Started, 805-7369
  Sun Fire 6800/4810/4800/3800 Systems Service Manual, 805-7363
Software:
  Sun Fire 6800/4810/4800/3800 Systems Platform Administration Manual, 805-7373
  Sun Fire 6800
12. 4800/3800 systems, if a single partition is established, it can support two domains; if two partitions are established, however, each partition supports only one domain.
Domains
The Sun Fire system can be logically divided into multiple domains. Since each domain comprises one or more system boards, a domain can have between one and 24 processors. Each domain runs its own instance of the operating system and has its own peripherals and network connections. You can configure domains without interrupting the operation of other domains on the same system. Domains can be used for:
- Testing new applications
- Operating system updates
- Configuring several domains to support separate departments
While production work continues on the remaining (and usually larger) domain, there will not be any adverse interaction between any of the domains. You can gain confidence in the correctness of applications without disturbing production work. When the testing work is complete, the system can be rejoined logically without rebooting; there are no physical changes when you use domains. Thus, if problems occur, the rest of your system is not affected.
The Sun Fire 6800 system can have up to four domains. The Sun Fire 4810/4800/3800 systems can each have up to two domains. Each instance of the Solaris Operating Environment runs in its own domain. Domains do not depend on each other and do not interact with each other. A single partition on a Sun
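The domain limits stated across these excerpts can be captured in a few lines. The Python sketch below encodes them as read from the text: on the Sun Fire 6800, each partition supports up to two domains; on the 4810/4800/3800, a single partition supports two domains, while two partitions support one domain each. The function name and the string model identifiers are hypothetical.

```python
def max_domains_per_partition(model, partitions):
    """Return a list with the maximum number of domains for each partition."""
    if partitions not in (1, 2):
        raise ValueError("the excerpts describe one or two partitions")
    if model == "6800":
        return [2] * partitions              # up to four domains in total
    if model in ("4810", "4800", "3800"):
        return [2] if partitions == 1 else [1, 1]  # never more than two in total
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for model in ("6800", "4810", "4800", "3800"):
        for partitions in (1, 2):
            limits = max_domains_per_partition(model, partitions)
            print(model, partitions, limits, "total:", sum(limits))
```

Returning a per-partition list rather than a single total keeps the asymmetry visible: adding a second partition doubles the domain ceiling on the 6800 but leaves it unchanged on the smaller systems.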
13. 4810/4800/3800 System Controller Command Reference Manual, 805-7372
Accessing Sun Documentation Online
The docs.sun.com web site enables you to access a select group of Sun technical documentation on the Web. You can browse the docs.sun.com archive or search for a specific book title or subject at http://docs.sun.com
Ordering Sun Documentation
Fatbrain.com, an Internet professional bookstore, stocks select product documentation from Sun Microsystems, Inc. For a list of documents and how to order them, visit the Sun Documentation Center on Fatbrain.com at http://www.fatbrain.com/documentation/sun
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and suggestions. You can email your comments to Sun at docfeedback@sun.com. Please include the part number (805-7362-11) of your document in the subject line of your email.
CHAPTER 1 Sun Fire Product Overview
This chapter discusses the features and capacity of four Sun Fire servers:
- Sun Fire 6800 system
- Sun Fire 4810 system
- Sun Fire 4800 system
- Sun Fire 3800 system
This family of servers provides entry-level to high-end server functionality. The Sun Fire 6800 system has space for internal peripherals mounted within the 19-inch cabinet. You h
14. FIGURE 1-4 Sun Fire 4800 System, Front and Rear Views
Sun Fire 3800 System
The Sun Fire 3800 system supports two CPU/Memory board slots, two I/O assemblies, and two System Controller boards. The Sun Fire 3800 system is different from the Sun Fire 6800/4810/4800 systems in that it does not have Repeater boards and uses a different System Controller board and I/O assemblies. FIGURE 1-5 shows front and rear views of the Sun Fire 3800 system mounted in an optional Sun Fire cabinet. TABLE 1-5 lists the features of the Sun Fire 3800 base system.
TABLE 1-5 Sun Fi
15. IS PROVIDED "AS IS" AND NO WARRANTY, EXPRESS OR IMPLIED, IS GRANTED, INCLUDING WARRANTIES CONCERNING MERCHANTABILITY, THE FITNESS OF THE PUBLICATION FOR A PARTICULAR USE, OR ITS NON-INFRINGEMENT OF THIRD-PARTY PRODUCTS. THIS DISCLAIMER OF WARRANTY WOULD NOT APPLY TO THE EXTENT THAT IT IS HELD TO BE LEGALLY NULL AND VOID.
Contents
Preface
Chapter 1: Sun Fire Product Overview
  Standard Features
  Machine Configurations
  Sun Fire 6800 System
  Sun Fire 4810 System
  Sun Fire 4800 System
  Sun Fire 3800 System
Chapter 2: System Features and Capabilities
  Partitions and Domains
  Partitions
  Domains
  Reliability
  Reducing the Probability of Error
  Correcting Errors Using Error-Correcting Code
  Error-Correcting Code Protection of the Data Interconnect
  Detecting Uncorrectable Errors
  Multiple-Bit Data Errors
  Multiple-Bit Address Errors
  System Time-Out Errors
  Power Corrected Failures
  Environmental Sensing
  Temperature
  Power Subsystem
  Availability
  Availability Goals for the Sun Fire System
  High Availability Capabilities of the Sun Fire System
  Cooling
  AC Power Supply
  ECC
  Resiliency Capabilities
  DC Power
  Logic Boards
  Processor
  Memory
  Redundant Components
  Serviceability Capabilities
  Mechanical Serviceability
  Bulk Power Supply Removal and Replacement
  Fan Tray Removal and Repla
16. Fire 6800 system can be divided into two domains. Unlike partitions, domains share the Repeater boards. Each domain gets half the address bandwidth of a full system bus.
Reliability
The reliability capabilities of the Sun Fire fall into four categories:
- Reducing the probability of errors
- Correcting errors using error-correcting code (ECC)
- Detecting uncorrectable errors
- Environmental sensing
Reducing the Probability of Error
All the ASICs are designed for worst-case temperature, voltage, frequency, and airflow combinations. The high level of logic integration in the ASICs reduces component and interconnect count. A distributed power system improves power supply performance and reliability.
Extensive self-test upon power-on reboot after a hardware failure screens all of the key logic blocks in the Sun Fire:
- Built-in self-test logic in all the ASICs.
- The power-on self-test (POST), controlled from the System Controller board, tests each logic block first in isolation and then with progressively more of the system.
Failing components are electrically isolated from the centerplane. The result is that the system is booted only with logic blocks that have passed this self-test and that must operate without error.
All I/O cables have a positive lock mechanism and a strain relief support to prevent accidental disconnections.
Correcting Errors Using Error-Correcting Code
The Sun
17. Repeater board can route the five buses.
System Controller Board
The System Controller board contains the system clock and a service processor. One System Controller board is required per system. You can install one additional System Controller board for redundancy in all the Sun Fire systems. The Sun Fire 6800 system comes standard with two System Controller boards installed at the factory.
The processor on the board is a microSPARC™-IIep with its own POST/OBP flash PROM and 8 Mbytes of DRAM. The processor also has a 33 MHz PCI bus with two devices on it. With two System Controller boards in each system, if one System Controller board fails, the other System Controller board can take control of the system without causing a disruption in the main system operation.
The System Controller board has a 10/100BASE-T Ethernet connection and an Ebus interface for a variety of devices. These include a TOD/NVRAM device, flash PROM for extra NVRAM space, a large (4 Mbyte) flash PROM to hold the OS code, and one 16552 dual serial port device.
The System Controller board performs the following main functions:
- Sets up the system and coordinates the boot process
- Generates system clocks
- Monitors the environmental sensors throughout the system
- Analyzes errors and takes corrective action
- Sets up the system partitions and domains
- Provides the system console functionality
The major feature
18. abilities raise its availability from the normal commercial category to the high-availability category. These capabilities are grouped as follows:
- Fault-tolerant capabilities: Any single point of failure is entirely transparent to users. Users see no loss of performance or capability in the specific areas of the system that are fault tolerant.
- Resiliency capabilities: These capabilities enable processing and data access to continue in spite of a failure, possibly with reduced resources. These capabilities usually require that you reboot your system.
- Serviceability capabilities: These capabilities lower or eliminate the repair time when a failure occurs.
Cooling
The Sun Fire system has redundant cooling. If one fan fails, the remaining fans automatically increase their speed, thereby enabling the system to continue to operate even at the maximum specified ambient temperature. Therefore, operation need not be suspended when a fan fails. Also, you can replace a fan while the system is operating, again without any adverse impact on the availability metric. The Sun Fire system also has comprehensive, fail-safe temperature monitoring to ensure that components are not stressed by over-temperature conditions in the event of a cooling failure.
AC Power Supply
AC power is supplied to the Sun Fire system through up to four independent 30-ampere single-phase power supplies. Each AC input module carrie
19. ave the flexibility with the remaining three systems to install them in industry-standard 19-inch cabinets or have them preinstalled in a Sun Fire cabinet. The Sun Fire cabinet can hold one Sun Fire 4810 system, one Sun Fire 4800 system, or a maximum of three Sun Fire 3800 systems.
Standard Features
The standard features of these systems include:
- Rackmountable in industry-standard 19-inch rack (note 1)
- Support for up to 24 CPUs and 192 Gbytes of memory
- Support for up to 32 I/O slots
- PCI and CompactPCI I/O modules (note 2)
- Extensive redundancy
- System controllers
- Support for multiple domains
- Concurrent hardware maintenance
- Common components
- Redundant power and cooling
- 9.6 Gbyte bus bandwidth
Note 1: The Sun Fire 6800 system is not rackmountable; it comes already mounted in a cabinet.
Note 2: CompactPCI I/O is available only in the Sun Fire 3800 system.
The Sun Fire family of servers, of which the Sun Fire 6800/4810/4800/3800 systems are members, shares many common components, as shown in the following table.
TABLE 1-1 Sun Fire Shared Components
Components                 Sun Fire 6800   Sun Fire 4810   Sun Fire 4800   Sun Fire 3800
CPU/Memory board           x               x               x               x
CPU processors             x               x               x               x
Memory DIMMs               x               x               x               x
PCI I/O assembly           x               x               x               NA
CompactPCI I/O assembly    NA              NA              NA              x (table note 1)
System Controller board    x               x               x               (table note 2)
Repeater board             x               x               x               (table note 3)
Table note 1: The Sun Fire 3800 system only has CompactPCI I/O assemblies.
Table note 2: The System Controller board for the Sun Fi
20. ble nature of the address match logic in the memory controller.
Redundant Components
Both the customer mean time between failure and the customer availability measures of the system are enhanced by the Sun Fire system's capability to configure redundant components. There are no components in the system that cannot be configured redundantly if the customer desires. Each system board is capable of independent operation. The Sun Fire system is built with multiple system boards and is inherently capable of operating when only a subset of the configured boards is functional.
In addition to the basic system boards, redundantly configurable components include:
- System Controller boards
- Repeater boards
- Bulk power subsystems
- Bulk power supplies
- Peripheral controllers and channels
You can configure systems with multiple connections to the peripheral devices, enabling redundant controllers and channels. Software maintains the multiple paths and can switch to an alternate path on the failure of the primary path.
The system controller is controlled through a console interface workstation. Redundant system controllers and interfaces can be configured if the customer desires.
Serviceability Capabilities
To reduce repair time, the Sun Fire system has been designed with a number of maintenance capabilities and aids. These are used by the Sun Fire system administrator and by the service provider
21. fonts is protected by copyright and licensed from Sun suppliers. Parts of this product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, Sun Fire, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. The OPEN LOOK and Sun Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees who implement the OPEN LOOK graphical user interface and otherwise comply with Sun's written license agreements. THIS PUBLICATION
22. cement
  Domain Isolation
  Nonconcurrent Service
  Remote Service
Chapter 3: Hardware Overview
  Standard Operation
  Data Interconnect
  Console Bus Interconnect
  Sun Fire System Components
  CPU/Memory Board
  I/O Assemblies
  PCI I/O
  CompactPCI I/O
  Repeater Board
  System Controller Board
  Redundant System Controllers
  Virtual Domain Key Switches
  Solaris Console
  Virtual Time Of Day
  Environmental Monitoring
Figures
  FIGURE 1-1 Sun Fire Systems and Sun Fire Cabinet
  FIGURE 1-2 Sun Fire 6800 System Cabinet, Front and Rear Views
  FIGURE 1-3 Sun Fire 4810 System Mounted in Optional Sun Fire Cabinet, Front and Rear Views
  FIGURE 1-4 Sun Fire 4800 System, Front and Rear Views
  FIGURE 1-5 Sun Fire 3800 System Mounted in Optional Sun Fire Cabinet, Front and Rear Views
  FIGURE 2-1 Partitions and Domains on a Sun Fire 6800 System
  FIGURE 3-1 Standard Operation of the Sun Fire 6800/4810/4800/3800 Systems
Tables
  TABLE 1-1 Sun Fire Shared Components
  TABLE 1-2 Sun Fire 6800 System Features
  TABLE 1-3 Sun Fire 4810 System Features
  TABLE 1-4 Sun Fire 4800 System Features
  TABLE 1-5 Sun Fire 3800 System Features
23. gh levels of availability are essential. This is especially true for a large shared-resource system such as the Sun Fire system.
Availability Goals for the Sun Fire System
The reliability, availability, and serviceability (RAS) goals for the Sun Fire system are to protect the integrity of the customer's data and to maximize availability. The focus is on three areas:
- Problem detection and isolation: knowing what went wrong and ensuring that the problem is not propagated
- Tolerance and recovery: absorbing abnormal system behavior and fixing it or dynamically circumventing it
- Redundancy: replicating critical components
To ensure data integrity at the hardware level, all data is error-correction code (ECC) protected, and control buses are protected by parity checks, out to the data on the disks. These checks ensure that errors are contained.
For tolerance to errors, resilience capabilities are designed into the Sun Fire system to ensure that it continues to operate, even in a degraded mode. Because it is a symmetrical multiprocessing system, the Sun Fire system can function with one or more processors disabled. In recovering from a problem, the system is checked quickly to determine the fault and to ensure minimum downtime. The system can be configured with redundant hardware to reduce downtime.
High Availability Capabilities of the Sun Fire System
The Sun Fire system cap
24. l functions of detail:
- Standard operation: simple SMP OS functions
- Interconnect: details of interconnect for OS boot and RAS features
- Console bus interconnect: details of how the System Controller board controls the system
Standard Operation
The standard operation is simply that of an SMP running an operating system with standard functions. It consists of CPU/Memory devices and I/O devices connected through an interconnect bus. Although the data interconnect is actually a crossbar switch, it is logically a bus. This is illustrated in FIGURE 3-1.
[Figure: I/O controllers with PCI and EPCI buses, and CPUs with data switches and memory, all attached to a shared data/address bus.]
FIGURE 3-1 Standard Operation of the Sun Fire 6800/4810/4800/3800 Systems
Data Interconnect
Although the standard operation of the Sun Fire system is that of a simple bus-like interconnect, it is actually a point-to-point switched interconnect with two levels of repeaters or switches. The switch i
25. mblies: 2 PCI
System Controller boards: 2
Repeater boards: 2
Domains: 2 maximum
Power supplies: 3
Power requirements: 200-240 VAC
Redundant cooling: Yes
Redundant AC input: No
Internal peripherals: None
Packaging: Rackmountable or mounted in Sun Fire cabinet

[FIGURE 1-3: Sun Fire 4810 System Mounted in Optional Sun Fire Cabinet, Front and Rear Views]

Sun Fire 4800 System

The Sun Fire 4800 system has support for three CPU/Memory boards, two I/O assemblies, two Repeater boards, and two System Controller boards. FIGURE 1-4 shows front and rear views of the Sun Fire 4800 s
26. r system configuration. The standard redundant configurations are three DC power supplies for up to three system boards, and six DC power supplies for a maximum configuration on the Sun Fire 6800 system.

Logic Boards

The System Controller board contains the system controller interface as well as the clock source and the emergency shutdown logic. Optionally, you can configure two System Controller boards in the system for redundancy.

The Repeater boards, the CPU/Memory boards, and the I/O subsystems hold the DC-to-DC converters that power the address repeater, the system data controller, the system data crossbar, and all other ASICs. If one Repeater board fails, the system will continue to operate in a degraded mode, which includes two of the four address buses and data buses.

Processor

If an UltraSPARC III processor, the dual UltraSPARC III data switch, the external cache SRAMs, or the associated port controller fails, the failed processor can be isolated from the remainder of the system by a power-on self-test (POST) configuration step. As long as there is at least one functioning processor available in the configuration, the system can operate.

Memory

When POST completes testing the memory subsystem, any faulty banks of memory will be identified. POST can then reconfigure the memory configuration using only reliable memory banks, taking advantage of the highly configura
27. re will require a reboot and will impact users, but a properly configured system will always be able to recover from any hardware failure.

The address path and data path are treated in slightly different ways. The address path has two completely redundant repeaters. A complete address repeater path requires two Repeater boards, as the Address Repeater (AR) function is bit-sliced across two ARs. On the Sun Fire 6800 system, the data path is bit-sliced across all four Repeater boards for standard operation. Optionally, a single pair of Repeater boards can be used in double-pumped mode so that full functionality, although with lowered data bandwidth, is retained.

The Repeater boards have active devices. Because centerplanes are relatively hard to service, the Sun Fire 6800/4810/4800 systems were designed so that no active devices are present on the centerplane. The Sun Fire 3800 system, however, incorporates all the active components onto its centerplane.

Console Bus Interconnect

The console bus enables the system controllers to read and write registers on the rest of the system. Only one of the two SCs can be master on the console bus at a time. Each system controller is connected to a console bus hub (CBH), and the two CBHs arbitrate for the use of the console bus.

CHAPTER 4

Sun Fire System Components

The Sun Fire 6800 4810 480
28. re 3800 System Features

Features: Quantity or Description
CPU/Memory boards: 2
CPUs: 8
Maximum memory: 64 Gbytes
I/O assemblies: 2 CompactPCI
System Controller boards: 2
Domains: 2 maximum
Power supplies: 3
Power requirements: 100-120 VAC or 200-240 VAC
Redundant cooling: Yes
Redundant AC input: No
Packaging: Rackmountable or mounted in a Sun Fire cabinet

[FIGURE 1-5: Sun Fire 3800 System Mounted in Optional Sun Fire Cabinet, Front and Rear Views]
29. re 3800 system is unique.

3. The Sun Fire 3800 system does not have Repeater boards.

Machine Configurations

Four machine configurations are available:

- Sun Fire 6800 system
- Sun Fire 4810 rackmountable system
- Sun Fire 3800 rackmountable system
- Sun Fire 4800 deskside or rackmountable system

Sun Fire 6800 System

The Sun Fire 6800 system has support for six CPU/Memory boards, four I/O assemblies, four Repeater boards, and two System Controller boards. Although there are four Repeater boards, they are logically two redundant repeaters (two boards together make up one logical repeater). FIGURE 1-2 shows front and rear views of the Sun Fire 6800 system cabinet. TABLE 1-2 lists the features of the Sun Fire 6800 system.

TABLE 1-2 Sun Fire 6800 System Features

Features: Quantity or Description
CPU/Memory boards: 6
CPUs: 24
Maximum memory: 192 Gbytes
I/O assemblies: 4 PCI
System Controller boards: 2
Repeater boards: 4
Domains: 4 maximum
Power supplies: 6
Power requirements: 200-240 VAC
Redundant cooling: Yes
Redundant AC input: Yes
Internal peripherals: None. However, space is available in the cabinet for peripherals options.
Packaging: Sun Fire 6800 cabinet

[Figure: rear and front views]
30. s capable of complex functions such as:

- Dividing the system into completely isolated partitions
- Dividing the partition into logically isolated domains

To boot the operating system and to exercise the functions listed above, the system controller needs to be aware of the logical structure of the switch interconnect.

The Sun Fire 6800 system has six slots for CPU/Memory boards. The Sun Fire 4810/4800 systems have three slots for CPU/Memory boards. The Sun Fire 3800 system has two slots for CPU/Memory boards. Each CPU/Memory board has up to four UltraSPARC III CPUs. Each CPU also includes a memory controller, and each CPU can support one memory bank with up to eight DIMMs.

The Sun Fire 6800 system has four bays for the I/O assemblies. Two bays are included in the Sun Fire 4810/4800/3800 systems for I/O assemblies. The Sun Fire 3800 system has a CompactPCI-only I/O chassis. Each PCI I/O assembly has two I/O controllers, each with one 66 MHz PCI bus and one 33 MHz PCI bus.

The Sun Fire 6800 system is designed to greatly improve reliability, availability, and serviceability (RAS) over previous generations of systems. The Sun Fire system is designed to be able to recover from any hardware failure. Some failure recovery will not impact users, for example, a power supply failure if the system is configured for redundant power supplies. Some failure recovery, for example, a CPU failu
31. s of the System Controller board are:

- Redundant system controller
- Virtual domain key switches
- Network Solaris software console for each domain
- Virtual time of day for each domain
- Environmental monitoring

The system controller provides five ports: domain A console, domain B console, domain C console, domain D console, and the system controller shell. The system controller shell provides the following:

- Configuration control
- Environmental status
- Ability to reconfigure domains
- Ability to power on and off power grids
- Ability to change the system controller password
- Other generic system controller functions

The system controller software sequences the booting of the system by:

- Configuring hardware
- Setting up domains
- Powering on and off components such as system boards, power supplies, and fans
- Testing components
- Building the domains

The system controller software provides tools for changing the configuration of the system, and it also logs errors. For more information on the system controller, refer to the Sun Fire 6800/4810/4800/3800 System Controller Command Reference Manual.

Redundant System Controllers

When two System Controller boards are installed in Sun Fire 6800/4810/4800/3800 systems, the second board is a redundant System Controller board. Each System Controller board can check the health and status of the other System Controller board. If the main
32. s power to two or three 1,280-watt bulk power supplies. The AC connections must be controlled by separate customer circuit breakers and can be on isolated power grids if a high level of availability is required. Optionally, third-party battery-backed-up power can be used to provide AC power in the event of utility failure.

ECC

On the Sun Fire system, data errors are detected, corrected, and/or reported by the data buffer on behalf of its associated processor. Additionally, data errors passing through the interconnection will be detected and will cause a record-stop condition for the ASICs. The ASICs detect and initiate this condition. These history buffers and record-stop condition bits can then be read and used by offline diagnostics.

Resiliency Capabilities

Resiliency capabilities enable processing and data access to continue in spite of a failure, possibly with reduced resources. These capabilities usually require that you reboot the system, and this is counted as repair time in the availability equation.

DC Power

The Sun Fire logic DC power system is modular at the system board level. Bulk 56 VDC is supplied through a circuit protector to each system board. This 56 volts is converted through several small DC-to-DC converters to the specific low voltages needed on the board. Failure of a DC-to-DC converter affects only that particular system board. You need to configure only as many bulk DC power supplies as are needed for the particula
33. Sun Microsystems

Sun Fire 6800/4810/4800/3800 Systems Overview

Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, CA 94303 U.S.A. 650-960-1300

Part No. 805-7362-11, April 2001, Revision A

Send comments about this document to: docfeedback@sun.com

Copyright 2001 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, CA 94303 U.S.A. All rights reserved.

This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.

Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.

Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, Sun Fire, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

The OPEN LOOK and Sun
34. some form of redundant check mechanism. Transmission of bad data is thus detected, preventing propagation of bad data without notification. All uncorrectable errors will result in an error condition. The recovery requires an operating system automatic reboot.

Multiple-Bit Data Errors

Multiple-bit ECC errors are detected by the receiving port, which notifies the operating system so that, depending upon what process is affected, the system as a whole can avoid failure. Parity errors on external cache reads to the interconnect become multibit ECC data errors and are handled as other multibit errors.

Multiple-Bit Address Errors

Multiple-bit ECC errors detected in the address interconnect are unrecoverable and are fatal to the operating system.

System Time-Out Errors

Time-out errors detected by the port controller or memory controller are an indication of lost transactions. Time-outs are therefore always unrecoverable.

Power Corrected Failures

The Sun Fire system uses a highly reliable distributed power system. Each I/O subsystem, CPU/Memory board, System Controller board, or Repeater board within the system has DC-to-DC converters for that board only, with multiple converters for each voltage. When a DC-to-DC converter fails, the system controller is notified. The system board reporting the failure will then be deconfigured from the system. No guarantee is made regarding continued system operation at the time of the failure.
35. ystem. TABLE 1-4 lists the features of the Sun Fire 4800 system.

TABLE 1-4 Sun Fire 4800 System Features

Features: Quantity or Description
CPU/Memory boards: 3
CPUs: 12
Maximum memory: 96 Gbytes
I/O assemblies: 2 PCI
System Controller boards: 2
Repeater boards: 2
Domains: 2 maximum
Power supplies: 3
Power requirements: 200-240 VAC
Redundant cooling: Yes
Redundant AC input: No
Internal peripherals: None
Packaging: Rackmountable, deskside, or mounted in a Sun Fire cabinet

[Figure: front and rear views]
36. Environmental Sensing

The system chassis environment is monitored for key measures of system stability, such as temperature, airflow, and power supply performance. The system controller constantly monitors the system environmental sensors in order to have enough advance warning of a potential condition so that the machine can be brought gracefully to a halt, avoiding physical damage to the system and possible corruption of data.

Temperature

The internal temperature of the system is monitored at key locations as a fail-safe mechanism. Based on temperature readings, the system can notify the administrator of a potential problem, begin an orderly shutdown, or power off the system immediately.

Power Subsystem

The Sun Fire system performs additional sensing to enhance reliability by enabling constant health checks. DC voltages are monitored at key points within the system. DC current from each power supply is monitored and reported to the system controller. The CPU power control will shut down any overheating CPU without shutting down the system. The reset signals in the Sun Fire system are sequenced with the DC power levels to guarantee stability of voltage throughout the cabinet prior to removing the reset signals and enabling normal operation of the Sun Fire system logic.

Availability

For organizations whose goal is to make information instantly available to users across the enterprise, hi
37. ystem controller shuts down various components to prevent damage.
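
The maximum CPU counts quoted in the feature tables follow from the statement that each CPU/Memory board holds up to four UltraSPARC III CPUs. The short Python sketch below checks that arithmetic against the table values. The class and field names are illustrative assumptions, not part of any Sun software, and the memory-per-CPU figure is derived here from the table values rather than stated in the overview.

```python
from dataclasses import dataclass

# From the overview: each CPU/Memory board has up to four UltraSPARC III CPUs.
CPUS_PER_BOARD = 4


@dataclass(frozen=True)
class SunFireModel:
    """Hypothetical record holding values copied from the feature tables."""
    name: str
    cpu_memory_boards: int
    max_memory_gbytes: int
    io_assemblies: int
    max_domains: int

    @property
    def max_cpus(self) -> int:
        # Maximum CPUs = number of CPU/Memory boards x CPUs per board.
        return self.cpu_memory_boards * CPUS_PER_BOARD

    @property
    def gbytes_per_cpu(self) -> float:
        # Derived figure (assumption): maximum memory spread evenly over CPUs.
        return self.max_memory_gbytes / self.max_cpus


MODELS = [
    SunFireModel("Sun Fire 6800", 6, 192, 4, 4),
    SunFireModel("Sun Fire 4810", 3, 96, 2, 2),
    SunFireModel("Sun Fire 4800", 3, 96, 2, 2),
    SunFireModel("Sun Fire 3800", 2, 64, 2, 2),
]

if __name__ == "__main__":
    for model in MODELS:
        print(f"{model.name}: {model.max_cpus} CPUs, "
              f"{model.gbytes_per_cpu:g} Gbytes per CPU at maximum memory")
```

The computed CPU counts (24, 12, 12, and 8) match the CPUs rows of the tables, and every model works out to the same derived memory-per-CPU figure.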
