Sun StorEdge™ T3 and T3+ Array Field Service Manual
Contents
1. <1> fru list
   ID      TYPE             VENDOR   MODEL          REVISION       SERIAL
   u1ctr   controller card  0301     501-5710-02    020100/020101  112035
   u2ctr   controller card  0301     501-5710-02    020100/020101  112122
   u1d1    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HMKJ
   u1d2    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HH2A
   u1d3    disk drive       SEAGATE  ST336704FSUN   A726           3CD1H9WS
   u1d4    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HM64
   u1d5    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HMC2
   u1d6    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HM63
   u1d7    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HE3A
   u1d8    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HNKO
   u1d9    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HM5P
   u2d1    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HHH5
   u2d2    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HMJC
   u2d3    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HGKR
   u2d4    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HLBJ
   u2d5    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HNHO
   u2d6    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HHAZ
   u2d7    disk drive       SEAGATE  ST336704FSUN   A726           3CD1H92W
   u2d8    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HN9T
   u2d9    disk drive       SEAGATE  ST336704FSUN   A726           3CD1HKOP
   u1l1    loop card        SCI-SJ   375-0085-01-   5.02 Flash     1413
   u1l2    loop card        SCI-SJ   375-0085-01-   5.02 Flash     2294
   u2l1    loop card        SCI-SJ   375-0085-01-   5.02 Flash     001415
   u21
2. ID      TYPE                VENDOR       MODEL          REVISION       SERIAL
   u1ctr   controller card     0301         501-5710-02    020100/020101  112035
   u2ctr   controller card     0301         501-5710-02    020100/020101  112122
   u1d1    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HMKJ
   u1d2    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HH2A
   u1d3    disk drive          SEAGATE      ST336704FSUN   A726           3CD1H9WS
   u1d4    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HM64
   u1d5    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HMC2
   u1d6    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HM63
   u1d7    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HE3A
   u1d8    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HNKO
   u1d9    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HM5P
   u2d1    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HHH5
   u2d2    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HMJC
   u2d3    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HGKR
   u2d4    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HLBJ
   u2d5    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HNHO
   u2d6    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HHAZ
   u2d7    disk drive          SEAGATE      ST336704FSUN   A726           3CD1H92W
   u2d8    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HN9T
   u2d9    disk drive          SEAGATE      ST336704FSUN   A726           3CD1HKOP
   u1l1    loop card           SCI-SJ       375-0085-01-   5.02 Flash     1413
   u1l2    loop card           SCI-SJ       375-0085-01-   5.02 Flash     2294
   u2l1    loop card           SCI-SJ       375-0085-01-   5.02 Flash     001415
   u2l2    loop card           SCI-SJ       375-0085-01-   5.02 Flash     002054
   u1pcu1  power/cooling unit  TECTROL-CAN  300-1454-01    0000           001787
   u
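The FRU identifiers in this listing appear to follow a fixed pattern: the letter u, a unit number, a short type code (ctr, d, l, or pcu), and an optional slot number. The sketch below splits an identifier into those parts. The function name, the regular expression, and the type labels are illustrative assumptions inferred from the listing, not part of the array's CLI.

```python
import re

# Assumed ID layout, inferred from the fru list output: u<unit><type><slot>.
# The labels below are illustrative, not taken from the array firmware.
FRU_KINDS = {
    "ctr": "controller card",
    "d": "disk drive",
    "l": "loop card",
    "pcu": "power/cooling unit",
}

_FRU_ID = re.compile(r"^u(\d+)(ctr|pcu|d|l)(\d*)$")


def parse_fru_id(fru_id):
    """Split an ID such as 'u2d7' into (unit, kind, slot)."""
    match = _FRU_ID.match(fru_id.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized FRU ID: {fru_id!r}")
    unit, kind, slot = match.groups()
    # Controller cards carry no slot number, so slot may be empty.
    return int(unit), FRU_KINDS[kind], int(slot) if slot else None
```

A helper like this can be used to group listing rows by unit before comparing serial numbers against a site inventory.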
3. 7. Power off both units.
      a. Type:
         <5> shutdown
         Shutdown the system, are you sure? [N]: y
      b. Press the power button once on each power and cooling unit to turn the switch off.
   8. Remove the interconnect cables from the back of each array.
   (Sun StorEdge T3 Array Field Service Manual, November 2002)
   FIGURE 10-4 Interconnect Cable Location (callout: interconnect cables)
   At this point, you can physically move the arrays apart. If you are moving the arrays to different locations, remove the other cables. Replace all cables except the interconnect cable when the arrays are at their permanent locations.
   Note: Do not power on the arrays until you complete the instructions in "Establishing a New IP Address" on page 151.
   Establishing a New IP Address
   In a partner group, the alternate master unit assumes the IP address of the master unit. When the partner group is disconnected, assign a new IP address to the previous alternate master unit so that it can operate as a single controller unit. The JumpStart feature automatically downloads a newly assigned IP address to the array. To enable this feature, you must edit your host file on a RARP server before powering on the array. After you power on, the IP address is automa
4. Found units: [u1-ctr]
   auto boot is enabled
   hit the RETURN key within 3 seconds to cancel
   Starting T3B Release 2.00 2001/04/02 15:21:29 (129.150.28.81)
   Copyright (C) 1997-2002 Sun Microsystems, Inc. All Rights Reserved.
   Initializing software
   Found units: [u1-ctr]
   Default master is u1
   Starting Heartbeats
   Assigning Select IDs u1 1
   Initializing system drivers
   Initializing XPT component
   Initializing QLCF component
   Initializing loop 1 ISP2200 firmware status 3
   Detected 10 FC-AL ports on loop 1
   Initializing loop 2 ISP2200 firmware status 3
   Detected 10 FC-AL ports on loop 2
   Initializing SVD services
   Detected data cache size in system 1GB
   Testing ISP2200 Passed
   Testing ECC mechanism Passed
   Testing XOR functions and datapaths Passed
   Cold Boot detected, destructive tests OK
   Testing data cache memory Passed
   Initializing Cache Memory
   Initializing system DB structure
   Initializing configuration
   Initializing port configuration
   Initializing loop 2 to accept SCSI commands
   Mounting root volume
   Checking local file system
   Initializing network routes
   Read PGR data Done
   Starting Syslog Daemon
   System has 1 active controller(s)
   Initializing TFTP
   Starting ftpd
   Starting telnetd
   Starting timed
   Starting pshd
   Starting httpd
   Starting snmpd
   Starting schd
5. TABLE 5-2 Disk Drive LED Descriptions
   Drive Activity (Green)  Drive Status (Amber)  Description
   Off                     Off                   Drive not installed (not recognized)
   Slow blink              Off                   Drive is spinning up or down
   Solid                   Off                   Drive OK, idle
   Flashing                Off                   Drive OK, activity
   Off                     Solid                 Drive reconstruct or firmware download in progress
   Off                     Slow blink            Drive failure; OK to replace drive
   Note: Even if the LED indicates a drive failure, always verify the FRU status using the CLI before replacing the drive. Refer to "Checking FRU Status" on page 35 for instructions.
   Repairing Disk Drives
   Caution: Replace only one disk drive in an array at a time to ensure that no data is lost. Before replacing another disk drive in the same array, complete any volume reconstructions and ensure that the replaced disk drive is fully functional and in operation.
   By default, the array automatically spins up and reenables a replaced disk drive, then automatically reconstructs the data from the parity or hot spare disk drives. The disk drive spinup takes about 30 seconds. Reconstruction of the data on the disk drive can take up to several hours, depending on system activity.
   Note: For the array to automatically reconstruct drive data, the array must remain powered on while a disk is replaced.
   Removing and Replacing a Disk Drive
   1. Observe static electricity precautions. See Static Electricity
6. chatter between the controllers on loop 2 whenever a cache flush occurs, common system events such as a battery refresh or reporting the temperature of the loop cards, and host-related events such as reboots. However, an error or warning will often trigger a cascade of notice messages indicating LUN takeovers, cache flushing, and so on. Like LIPs on an FC-AL, a few are acceptable and expected, but you should pay attention to storms of them or frequent repeats of the same message. Note that they often contain useful debugging information that can help determine the root cause of a failure. When you see these patterns, look back in the syslog for the Error (E) or Warning (W) that precipitated them.
   Interpreting ITL Messages in an FC-AL Environment
   Initiator Target LUN (ITL) messages record SCSI commands being received by the various port monitoring tasks. They are common and frequently not a cause for concern.
   Basic Example
   ITL 7D 5 1 TT 20 TID 9CA8 OP 0 Target in Unit Attention
   where:
   - ITL = Initiator, Target, LUN
   - TT = Tag Type 20; a tag type of 20 is a Simple Queue Tag
   - TID = Tag ID 9CA8, which is the unique tag ID number for this I/O
   - OP = SCSI OP code 0, which is Test Unit Ready
   Note: The initiator can be verified using luxadm -e dump_map <device>. Other common initiator (HBA) IDs are 7C (dec 124) and 7B (dec 123). So the things
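The field breakdown of the ITL example can be sketched in code. This is a minimal illustration, assuming every numeric field is hexadecimal (the text confirms this only for the initiator ID) and that fields are separated by whitespace or punctuation. The function name and the lookup tables are hypothetical and contain only the two decodings the text states.

```python
import re

# Only the decodings given in the text are listed here.
TAG_TYPES = {0x20: "Simple Queue Tag"}
OPCODES = {0x00: "Test Unit Ready"}

_HEX = r"([0-9A-Fa-f]+)"
_ITL = re.compile(
    rf"ITL\W+{_HEX}\W+{_HEX}\W+{_HEX}\W+"
    rf"TT\W+{_HEX}\W+TID\W+{_HEX}\W+OP\W+{_HEX}"
)


def parse_itl(line):
    """Return the ITL fields of a syslog line as a dict, or None."""
    match = _ITL.search(line)
    if match is None:
        return None
    # Assumption: all six fields are hexadecimal.
    initiator, target, lun, tag_type, tag_id, opcode = (
        int(value, 16) for value in match.groups()
    )
    return {
        "initiator": initiator,
        "target": target,
        "lun": lun,
        "tag_type": TAG_TYPES.get(tag_type, hex(tag_type)),
        "tag_id": tag_id,
        "opcode": OPCODES.get(opcode, hex(opcode)),
    }
```

Unknown tag types and opcodes are returned as hex strings rather than guessed, so the tables can be extended from the SCSI documentation as needed.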
7. 2 2 22 22 22 22 22 ISR1 ISR1 ISR1 ISR1 ISR1 ISR1 1 1 se ydr 1 1 Saag 22 uld9 uld8 uld7 uld6 uld5 uld4 SVD_PA SVD_PA SVD_PA SVD_PA SVD_PA SVD_PA H FAILOVER H FAILOVER H FAILOVER H FAILOVER H FAILOVER H FAILOVER path id path id path id path id path id path id oooooo As can be seen the data is the similar to what Storage Automated Diagnostic Environment message monitoring would display and indicates the same possible failure condition on loop 1 path 0 Example syslog Error Messages CODE EXAMPLE 8 11 displays some example syslog error messages that might indicate a back end FC AL drive loop problem CODE EXAMPLE 8 11 Drive Loop Problem Example Error Messages Sep 27 18 36 53 T3A ROO W Sep 27 18 48 46 T3A ISRI W 0x1 lun 0x0 Sep 28 06 52 23 T3A CFG W Sep 28 06 53 49 T3A LPC E Sep 28 06 53 49 T3A TMR E minutes Sep 28 06 53 49 T3A LPCT 1 E Sep 28 07 01 41 T3A ISR1 2 W Sep 28 07 01 41 T3A ISR1 2 W Ready Initializing CMD Requi Sep 28 07 01 41 T3A ISR1 2 W Sep 28 07 01 41 T3A ISR1 2 W Sep 28 07 01 41 T3A WXFT 2 W Sep 28 07 01 41 T3A WXFT 2 W disable Sep 28 07 10 27 T3A LTO1 1 W Sep 28 07 15 05 T3A ISR1 1 W 0x1 lun 0x0 Sep 28 07 15 05 T3A ISRI 1 W Sep 28 07 18 03 T3A ISR1 1 W ir ulctr Hardware Reset 1000 occurred SC
8. Unlock the controller card by pushing in on the latch handles. Use a coin or small screwdriver to press in and release the latch handle.
   FIGURE 4-2 Removing the Controller Card (callouts: Sun StorEdge T3 array controller card, latch handle)
   6. Pull the controller card out using the latch handles.
   7. Insert the new controller card.
   8. Lock the new controller card by pushing in the latch handles. Use a coin or small screwdriver to press in and secure the latch handle.
   9. Insert the fiber-optic cable (and MIA, for T3 controllers) back into the FC-AL connector.
   10. Insert the Ethernet cable into the Ethernet port.
   11. Check the controller status LED to determine when the controller is operational. While the controller boots, the controller status LED is solid amber. When the controller is operational, the LED is green.
   12. Verify the status of the controller card using the CLI. Refer to "Checking FRU Status" on page 35 for instructions.
   Note: In a partner group configuration, the controller fails over to the alternate master when there is a controller card failure in a master unit. After the controller board is replaced, use the reset command if you wish to have u1 become the master again.
   Upgrading Controller Firmware
   The controller firmware can be upgraded on an operational system. However, for the upgrade to take effect
9. that should be taken when installing a Sun Microsystems product.
   Safety Precautions
   For your protection, observe the following precautions when installing the equipment:
   - Follow all warnings and instructions marked on the equipment.
   - Ensure that the voltage and frequency of your power source match the voltage and frequency inscribed on the equipment's electrical rating label.
   - Never push objects of any kind through openings in the equipment. Dangerous voltages may be present. Any conductive object inserted this way could produce a short circuit that could cause fire, electric shock, or damage to the equipment.
   Symbols
   The meaning of the symbols used is given below.
   Caution: There is a risk of personal injury and equipment damage. Follow the instructions.
   Caution: Hot surface. Avoid contact. Surfaces are hot and may cause personal injury if touched.
   Caution: Hazardous voltages are present. To reduce the risk of electric shock and danger to personal health, follow the instructions.
   Caution: On. Your system
10. verify 171 verifying firmware level 33 vol command adding a volume 147 checking data parity 62 rebuilding a replaced FRU 68 verify subcommand 62 vol disable command 122 vol list command 61 vol mode command 99 vol mount command 73 vol recon command 68 123 vol stat command 60 vol unmount command 72 vol verify command 62 volume defaults 171 WWN 132 volumes disabling 122 mounting 123 reconstruction of 123 unmounting 71 118 W warning message type 174 severity level 3 web site SunSolve 32 worksheets 221 WWN 132 Index 235 236 Sun StorEdge T3 Array Field Service Manual November 2002
11. T300-EP> set tftphost IP_address
    T300-EP> set tftpfile controller_binary
    T300-EP> set bootmode tftp
    bootdelay 3
    sn        XXXXXX
    ip        10.1.102.112
    netmask   255.255.255.0
    gateway   xxx.xxx.xxx.xxx
    tftphost  xxx.xxx.xxx.xxx
    tftpfile  nb210.bin
    hostname  T3
    spindelay 0
    revision  0210
    mac       xx:xx:xx:xx:xx
    rarp      on
    9. Reset the array:
       T300-EP> reset
    10. Observe the boot cycle.
       - If the system is able to boot to a normal login prompt, proceed to Step 11.
       - If the array continues to boot in a cycle, stop the cycle and break to the diagnostic menu by pressing Ctrl-t, and continue pressing at one-second intervals until the booting stops. Press Return at the diagnostic menu prompt and continue below.
       i. From the diagnostic menu, select Quit but go into Label Control Menu.
       ii. From the Label Control Menu, select Wipe out unit 1 Sysarea LFS.
       iii. Select Quit All. The system should continue the boot cycle.
       iv. Verify that the system boots to the login prompt, and log in as the root user.
       v. Use the appropriate patch to execute the t3.sh script to restore the missing files to the array local file system:
          Sun StorEdge T3 controller patch 109115
          Sun StorEdge T3+ controller patch 112276
    11. Install the boot code by typing:
        T3 <1> boot -i nb210.bin
    12. Set the boot mode to auto by typing:
        T3 <2
12. 33 interconnect card 33 level 32 upgrading 51 70 79 verifying level 33 FLASH memory device 51 flow charts 22 FMD see FLASH memory device format command 26 front panel replacing 65 FRU identifiers 3 217 fru list command 34 67 fru myuid command 29 fru stat command 35 70 98 ftp 12 G gateway 169 H hardware reset log type 191 host generated messages 2 hostname 170 hosts file 17 hot spare checking 61 I id read command 88 inetd conf file 16 information message type 174 severity level 3 init 171 232 Sun StorEdge T3 Array Field Service Manual November 2002 installation setting the IP address 142 interconnect assemblies 167 interconnect cable connection 140 interconnect cards 216 217 assembly 163 firmware 79 FRU identifiers 217 LEDs 76 removing and replacing 77 upgrading firmware 79 iostat output 107 ip 169 L LAC Reserve 114 LEDs controller cards 47 interconnect cards 76 power and cooling unit 83 logging remote 17 logical unit numbers see LUNs 1 loglevel 170 logto 170 loop stat command 101 loop identifiers 219 loop problems 95 baseline data 107 diagnosis 96 105 error messages 109 indicators 106 normal status 96 Product Watch messages 108 repair procedures 96 115 syslog file 108 LUNs 1 M mac 170 MAC addresses location 37 127 130 maintenance precaution 2 memsize 171 messages syntax 3 174 types 174
13. Sense Key Explanations on page 211 and the following web site to decipher these: http://www.t10.org/lists/1spc-lst.htm
    Line 3: An explanation of the sense key (see list below).
    Line 4: Not useful.
    Examples
    Recoverable
    09:58:43 ISR1[1]: N: u1d3 SCSI Disk Error Occurred (path = 0x1)
    09:58:43 ISR1[1]: N: Sense Key = 0x1, Asc = 0x17, Ascq = 0x1
    09:58:43 ISR1[1]: N: Sense Data Description = Recovered Data With Retries
    09:58:43 ISR1[1]: N: Valid Information = 0x26af795
    09:58:58 ISR1[1]: N: u1d3 SCSI Disk Error Occurred (path = 0x1)
    09:58:58 ISR1[1]: N: Sense Key = 0x1, Asc = 0x18, Ascq = 0x2
    09:58:58 ISR1[1]: N: Sense Data Description = Recovered Data - Data Auto Reallocated
    09:58:58 ISR1[1]: N: Valid Information = 0x26af795
    The errors above indicate that the drive had a problem and resolved it on its own by re-reading the information, marking a sector bad, and auto-reallocating the data to an alternate sector.
    Parity Errors
    12:39:06 ISR1[2]: W: u2d6 SCSI Disk Error Occurred (path = 0x0)
    12:39:06 ISR1[2]: W: Sense Key = 0xb, Asc = 0x47, Ascq = 0x0
    12:39:06 ISR1[2]: W: Sense Data Description = SCSI Parity Error
    12:39:06 ISR1[2]: W: Valid Information = 0x3379602
    Common Host Port FCC0 Messages
    13:42:41 FCC0[1]: N: u1ctr IDE received on port 0 abort 0
    where IDE = Initiator Detected Error. A
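Pulling the sense key, ASC, and ASCQ out of a syslog line makes it easier to look them up in the T10 list referenced above. The sketch below is one way to do that. The function names are hypothetical, the regular expression tolerates either "Sense Key = 0x1," or "Sense Key 0x1" spacing, and the sense key names are the standard SCSI ones rather than text taken from this manual.

```python
import re

# Standard SCSI sense key names (from the SCSI command set, not this manual).
SENSE_KEYS = {
    0x0: "No Sense", 0x1: "Recovered Error", 0x2: "Not Ready",
    0x3: "Medium Error", 0x4: "Hardware Error", 0x5: "Illegal Request",
    0x6: "Unit Attention", 0x7: "Data Protect", 0x8: "Blank Check",
    0x9: "Vendor Specific", 0xA: "Copy Aborted", 0xB: "Aborted Command",
    0xD: "Volume Overflow", 0xE: "Miscompare",
}

_SENSE = re.compile(
    r"Sense Key\W*(0x[0-9a-f]+)\W+Asc\W*(0x[0-9a-f]+)\W+Ascq\W*(0x[0-9a-f]+)",
    re.IGNORECASE,
)


def parse_sense(line):
    """Return (sense_key, asc, ascq) as integers, or None if absent."""
    match = _SENSE.search(line)
    if match is None:
        return None
    return tuple(int(value, 16) for value in match.groups())


def sense_key_name(key):
    """Map a sense key number to its standard SCSI name."""
    return SENSE_KEYS.get(key, "Reserved")
```

For the two examples in the text, the recoverable errors carry sense key 0x1 and the parity error carries sense key 0xb.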
14. Separate power cords are used for the connector on each power and cooling unit to provide redundant cabling. The power cords need to be connected to separate AC power sources for full redundancy.
    FIGURE 7-1 Power Cords Connected to the Power and Cooling Units (callout: power switches)
    Caution: Do not handle the power and cooling unit when the power cord is connected. Line voltages are present within the power and cooling unit when the power cord is connected, even if the power switch is off.
    At the rear of the power and cooling unit is a recessed PC card connector. Do not touch this connector or allow any metal object to touch it. The power and cooling unit contains the UPS battery backup.
    Note: The batteries in the power and cooling units recharge after powering on the array. If the batteries are less than fully charged, fru stat output displays batteries in a fault condition and write-behind cache is disabled until the batteries are charged. The system can take several hours to determine the health of the batteries after the system is turned back on. Batteries reflect a non-optimal state after power loss events and also after turning off power switches.
    Power and Cooling Unit LEDs
    Each of the power and cooling units has an AC LED and a pow
15. Task  Description
    TIME  Time Daemon
    HT00  Process HTTP connections
    HTPD  Listen for HTTP connections
    SNMP  Process SNMP requests
    Pshd  Shell Daemon. This spawns individual shell task
    Pshc  Execute shell commands
    Tnpd  Telnet daemon
    Internal Sun StorEdge T3 Array AL_PA/LID/LOOP Map
    TABLE C-8 Internal Sun StorEdge T3 Array AL_PA/LID/LOOP Map
    Device  al_pa  loop_id (hex)  loop_id (dec)  Target  LID Order  Loop
    u1d3    d5     0xa            10             3       1          0
    u1d2    d6     0x9            9              2       2          0
    u1d1    d9     0x8            8              1       3          0
    u2d9    36     0x62           98             17      4          1
    u2d8    c5     0x17           23             16      5          1
    u2d7    c6     0x16           22             15      6          1
    u2ctr   e8     0x01           1              N/A     7          N/A
    u2d3    cb     0x12           18             11      8          0
    u2d2    cc     0x11           17             10      9          0
    u2d1    cd     0x10           16             9       10         0
    u2d6    c7     0x15           21             14      11         1
    u2d5    c9     0x14           20             13      12         1
    u2d4    ca     0x13           19             12      13         1
    u1d6    d2     0xd            13             6       14         1
    u1d5    d3     0xc            12             5       15         1
    u1d4    d4     0xb            11             4       16         1
    u1d9    39     0x61           97             18      17         1
    u1d8    ce     0xf            15             8       18         1
    u1d7    d1     0xe            14             7       19         1
    u1ctr   ef     0x00           0              N/A     20         N/A
    SCSI Virtual Disk Driver (SVD) Error Definitions
    TABLE C-9 SVD Disk Error Definitions
    Opcode  Error
    0x0     Request in progress
    0x1     Completed without error
    0x2     Retry attempted
    0x3     Completed with error
    0x4     Retries exhausted
    0x5     LBA out of range
    0x6     I/O enqueue failure
    0x7     Invalid command specified
    0x8     Resource not available
    0x9     Invalid command specified
    0xA     Device already open
    0xB     Device exclusively opened
    0xC     Resource not available
    0xD     On-disk label not found
    0xE     I
16. and SADE with 2nd copy of SADE application running on management host with Ethernet connection to array: CLI-E, CLI-S, CLI-E, CLI-S
    - LED: Light-emitting diodes on the array
    - CLI-E: Command-line utilities run via an Ethernet connection, as described in the Sun StorEdge T3 Array Administrator's Manual
    - CLI-S: Command-line utilities run via a serial connection, as described in "Establishing a Serial Port Connection" on page 7
    - OFDG: Off-line Drive Diagnostic utility, as described in "Using the ofdg Diagnostic Utility" on page 111
    - SNMP: Simple Network Management Protocol, as described in the Sun StorEdge T3 Array Administrator's Manual
    - SNMP-CA: Simple Network Management Protocol used with a customer-written application, as described in the Sun StorEdge T3 Array Administrator's Manual
    - SADE: The Storage Automated Diagnostic Environment application, as described in the Storage Automated Diagnostic Environment User's Guide
    - syslog: Sun StorEdge T3 array syslog file
    - syslog-CA: Sun StorEdge T3 array syslog with customer-written application
    - SRS: Sun Remote Service
    Troubleshooting Flow Charts
    The following three charts illustrate typical diagnostic procedures.
    Unable to communicate to the volume from the data host, or an excessive number (more than 10 in 24 hours) of online/offline messages in t
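The flow chart's entry condition, more than 10 online/offline messages in 24 hours, is a sliding-window count. The sketch below shows one way to check it against a list of message timestamps. The function name and parameters are illustrative assumptions; how the timestamps are extracted from the host's message log is left out.

```python
from datetime import datetime, timedelta


def exceeds_threshold(timestamps, limit=10, window=timedelta(hours=24)):
    """Return True if more than `limit` events fall within any `window`."""
    events = sorted(timestamps)
    start = 0
    for end, current in enumerate(events):
        # Drop events that are a full window (or more) older than this one.
        while current - events[start] >= window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False
```

Eleven messages spaced an hour apart trip the check, while ten messages in the same period do not.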
17. the use of
    Safety Regulations
    The following text describes the safety measures to follow when installing a Sun Microsystems product.
    Safety Precautions
    For your protection, observe the following safety precautions when handling your equipment:
    - Follow all warnings and instructions marked on the equipment.
    - Ensure that the voltage and frequency of your power source match those described on the equipment's electrical rating labels.
    - Never push objects of any kind through openings in the equipment. Dangerous voltages may be present. Conductive foreign objects can produce a short circuit that could cause fire, electric shock, or damage to the equipment.
    Symbols
    The following symbols appear in this book:
    Caution: There is a risk of personal injury and equipment damage. Follow the instructions.
    Caution: Hot surface. Avoid contact. Surfaces are hot and may cause personal injury if touched.
    Caution: Hazardous voltages are present. To reduce the risk of electric shock and danger to personal health, follow the instructions.
    Caution: On. Applies AC power to the system. Depending on the type of power switch
18. loglevel 3
    rarp     off
    mac      00:20:2:00:03:b9
    You will need these values in Step 18.
    9. From the host Tip session, set logto to 1 and loglevel to 4:
       <5> set logto 1
       <6> set loglevel 4
       These settings display all messages on the Tip session screen. The output includes all messages from information up to error.
    10. Run a find test against loop 1:
       <7> ofdg find u1l1
       WARNING - Volume data will be offline while OFDG is running. Continue? [N]: y
       How far the test has to go into the loop to identify the failed FRU determines how long the test runs. The find test may also have to be run again with the u2l1 parameter if no failures are found with the u1l1 parameter.
    11. Examine the output in detail to identify the failed FRU.
       For comparison, a test run that found no errors is shown in CODE EXAMPLE 8-13. This test might take 8 minutes to complete.
    CODE EXAMPLE 8-13 ofdg Sample Output (No Errors)
    <8> ofdg find u1l1
    WARNING - Volume data will be offline while OFDG is running. Continue? [N]: y
    ONDG Initiated
    FIND Initiated on u1l1
    Loop 1 Configured as <1>
    Loop 2 Not Available
    Loop 1 Configured as <1>
    Loop 2 Not Available
    Loop 1 Configured as <1>
    Loop 2 Not Available
    Loop 1 Configured as <1>
    Loop 2 Not Available
    FIND Completed on u1l1 STATUS PASS
    u1 PASS
    ONDG C
19. midplane 125 etc ethers file 131 etc hosts 131 etc nsswitch conf 131 disk positions 129 MAC address 130 partner groups 126 130 replacement 126 mirror 170 model 170 mp_support 170 N netmask 169 notice message type 174 Notice severity level 3 notice message see messages nsswitch conf file 131 O ofdg utility 111 117 example 117 fast_find option 112 114 find option 114 go no go 112 health_check option 113 LUN assignments 111 options 111 reguirements 111 off line diagnostics see of dg utility P partner group 1 fully cabled 141 PATH_POLICY 103 PCU see power and cooling unit port command 171 Index 233 port list command 31 port listmap command 100 power and cooling unit 81 164 LEDs 83 removing and replacing 85 PPATH 102 proc list command 68 Product Watch messages 108 pSOSFail reset log type 191 R RAID controller see controller cards RAID volumes 1 RAIDFail reset log type 191 rarp 170 RARP daemon 131 rd_ahead 170 recon_rate 171 refresh s command 88 remote logging 17 reserved system area recovery 40 reset log types 191 reset y command 73 revision 170 S Safety Agency Compliance statements French xi German ix Spanish xiii SCSI Disk Error Occurred 109 SCSI Parity Error 109 serial number location 37 127 130 set command 169 setting the IP address 142 shell prompts xxx shutdown command 126 Simple Network Managemen
20. the controller must be reset (booted). While the controller boots, the array is not available for storage.
    The firmware upgrade procedures that follow must be done through the Ethernet connection. The latest firmware version is located on the SunSolve web site: http://sunsolve.sun.com
    The following conditions apply to firmware upgrades:
    - The firmware has to be resident on the host for this operation.
    - The Sun StorEdge T3 array has to have a root password prior to attempting this procedure.
    To upgrade the firmware, see the Sun StorEdge T3 Array Installation and Configuration Manual.
    Controller EPROM Firmware
    The EPROM firmware is stored in the FLASH memory device (FMD) on the controller card. The array can be operational during the EPROM firmware upgrade.
    Note: To upgrade the EPROM firmware in a partner group, you need to perform this procedure only once for both units to be upgraded.
    The latest firmware versions are located on the SunSolve web site: http://sunsolve.sun.com
    Firmware is released as a patch, which consists of an entire tar file with an automated uploader script that copies the files, including the ep and lpc images, to the Sun StorEdge T3 array being upgraded.
    Firmware Upgrade Discussion
    Boot Code Explanation
    There are three levels of boot code, plus an extended POST code for factory testing.
    - The first level selects and jumps to one of the two copies of t
21. 1 6 Inch Disk Drive Specifications Sun Enterprise 6x00 5x00 4x00 3x00 Systems SBus and Graphics I O Boards Installation Guide Sun StorEdge PCI FC 100 Host Adapter Installation Sun StorEdge SBus FC 100 Host Adapter Installation and Service Manual Sun StorEdge PCI Single Fibre Channel Network Adapter Installation Guide Sun StorEdge PCI Dual Fibre Channel Host Adapter Installation Guide Part Number 816 4771 816 4768 816 0774 816 0778 816 4769 816 4770 806 7979 806 1493 806 6383 806 4800 805 2704 805 3682 806 7532 806 7532 806 4199 Preface xxxi Application Title Sun StorEdge Compact PCI Dual Fibre Channel Network Adapter Installation and User s Guide Testing the array Storage Automated Diagnostic Environment User s Guide Storage Automated Diagnostic Environment Version 2 0 06 010 Release Notes Part Number 816 0241 816 3142 816 3141 1 Can be found at http webhome central storade Accessing Sun Documentation Online You can access a select group of Sun technical documentation on the Web You can browse the documentation archive at http www sun com products n solutions hardware docs Sun Welcomes Your Comments Sun is interested in improving its documentation and welcomes your comments and suggestions You can email your comments to Sun at docfeedback sun com Please include the part number 816 4774 10 of your document in the
22. 2002
    Total time elapsed: 22 hours, 0 minutes, 48 seconds.
    refresh -s
    No battery refreshing Task is currently running.
         PCU1    PCU2
    U1   Normal  Normal
    U2   Normal  Normal
    Wed Aug 21 16:45:36 GMT 2002
    Battery Maintenance
    The battery refresh cycle occurs automatically once every 28 days. The battery refresh cycle is sequential, ensuring that only one battery in a unit is refreshed at a time. The refresh cycle consists of a 6-minute discharge period followed by a recharge period of 6 to 12 hours. The refresh cycle verifies the health of the battery. If a problem is detected with the battery during the refresh, future refresh operations are suspended until the problem is fixed. When refresh is suspended, battery write-behind caching is turned off automatically as a safety precaution.
    The syslog file indicates a battery refresh operation in progress. Use the refresh -s command to view an active refresh operation. Refer to the Sun StorEdge T3 Array Administrator's Manual for more information on this command.
    Refresh cycle time is controlled by the array's /etc/schd.conf file. For example, to specify that a battery refresh cycle begins on January 15, 2001 at 11 p.m., the entry in the /etc/schd.conf file is:
    cat /etc/schd.conf
    BAT_BEG 1-15-2001,23-00-00
    BAT_CYC 28
    You can tune the /etc/schd.conf file to specify the interval between battery refresh cycles and to initiate a refresh on a particular day. To specify beginn
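To see which dates a given schedule produces, the arithmetic can be worked through directly: each refresh starts a whole number of cycles after the begin time. The sketch below uses the example values from the text (a start of January 15, 2001 at 11 p.m. and a 28-day cycle). The function name is illustrative, and the code does not read or write the array's schedule file.

```python
from datetime import datetime, timedelta


def refresh_start_times(begin, cycle_days, count):
    """List the first `count` refresh start times for a schedule."""
    return [begin + timedelta(days=cycle_days * n) for n in range(count)]


# Example values from the text: begin January 15, 2001 at 23:00:00,
# with one refresh cycle every 28 days.
schedule = refresh_start_times(datetime(2001, 1, 15, 23, 0, 0), 28, 3)
for start in schedule:
    print(start.strftime("%Y-%m-%d %H:%M:%S"))
```

With these values, the second and third cycles start on February 12 and March 12, 2001, each at 11 p.m. Because each refresh includes a recharge period of 6 to 12 hours, write-behind cache behavior may be affected into the following morning.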
23. Array Field Service Manual November 2002 Checking disk positions Initializing host port ulpl ISP2200 firmware status 7 Host port ulpl TARGET_ID Oxffff ALPA 0x5 Starting psh Login Appendix C Sun StorEdge T3 Array Messages 197 Sun StorEdge T3 Array Enterprise Configuration T3B 2 Starting POST POST end Starting T3B EP Release 2 01 2002 07 30 16 33 52 129 150 28 80 Copyright C 1997 2002 Sun Microsystems Inc All Rights Reserved Found units ul ctr u2 ctr auto boot is enabled hit the RETURN key within 3 seconds to cancel Starting T3B Release 2 01 2002 07 30 15 21 29 129 150 28 80 Copyright C 1997 2002 Sun Microsystems Inc All Rights Reserved Initializing software Found units ul ctr u2 ctr Default master is ul Default alternate master is u2 Master coming up Starting Heartbeats Assigning Select IDs ul 1 u2 2 Initializing system drivers Initializing XPT component Initializing OLCF component Initializing loop 1 ISP2200 firmware status 3 Detected 19 FC AL ports on loop 1 Initializing loop 2 ISP2200 firmware status 3 Detected 19 FC AL ports on loop 2 Initializing SVD services Detected data cache size in system 1GB Testing ISP2200 Passed Testing ECC mechanism Passed Testing XOR functions and datapaths Passed Cold Boot detected destructive tests OK 198 Sun StorEdge T3 Array F
24. Class B digital apparatus complies with Canadian ICES-003.
    BSMI Class A Notice
    The following statement is applicable to products shipped to Taiwan and marked as Class A on the product compliance label.
    Safety Agency Compliance Statements
    Read this section before beginning any procedure. The following text provides safety precautions to follow when installing a Sun Microsystems product.
    Safety Precautions
    For your protection, observe the following safety precautions when setting up your equipment:
    - Follow all cautions and instructi
25. Environment Message Monitoring" on page 108 for more detail.
    3. A third indication of a problem may be a message or change of status in the Component Manager maintenance program GUI display, for example, a suspect FRU highlighted in red. Component Manager also sends e-mail to whomever the customer specifies and logs the failure into a customer-designated log file on the host that Component Manager is running on. See "Example syslog Error Messages" on page 109 for more details.
    4. A fourth indication of a problem may be a warning or error log entry in the local array syslog file. Examine this file by using CLI commands via a Telnet or Tip connection. This file can also be transferred via ftp to another host for examination and archiving. See "Manual Examination of the syslog File" on page 108 and "Example syslog Error Messages" on page 109 for more details.
    5. Additional indications of an FC-AL loop problem can be provided by running the CLI commands described in "Normal Status" on page 96. See "Using CLI Diagnostic Commands" on page 110 for more detail.
    If, after this information has been gathered and examined, it has been determined that one of the back-end FC-AL loops has failed but no definitive FRU can be identified, perform one or more of the diagnostic procedures described in the following sections.
    Checking Performance Against Baseline Data
    If the cu
26. LOOK and otherwise comply with Sun's written licenses. THE DOCUMENTATION IS PROVIDED "AS IS" AND ALL OTHER EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES ARE FORMALLY EXCLUDED TO THE EXTENT PERMITTED BY APPLICABLE LAW, INCLUDING IN PARTICULAR ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
Regulatory Compliance Statements
Your Sun product is marked to indicate its compliance class:
• Federal Communications Commission (FCC), USA
• Industry Canada Equipment Standard for Digital Equipment (ICES-003), Canada
• Voluntary Control Council for Interference (VCCI), Japan
• Bureau of Standards Metrology and Inspection (BSMI), Taiwan
Please read the appropriate section that corresponds to the marking on your Sun product before attempting to install the product.
FCC Class A Notice
This device complies with Part 15 of the FCC Rules. Operation is subject to the following two conditions:
1. This device may not cause harmful interference.
2. This device must accept any interference received, including interference that may cause undesired operation.
Note: This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interfer
27. PCU
Note: The battery is attached to the bottom panel of the PCU. When removing the bottom panel, do not attempt to remove it completely, as the battery is still connected to the unit.
FIGURE 7-5 Removing the Screws from the PCU Bottom Panel
4. Slide the bottom panel off the unit just enough to expose the battery connector, as shown in FIGURE 7-6 and FIGURE 7-7.
FIGURE 7-6 Lifting the PCU Bottom Panel and Battery Slightly Away from the Unit
FIGURE 7-7 The Battery Connector Details Inside the PCU
5. Remove the battery connector by pulling on it firmly, straight out from the connector inside the PCU.
6. Lift the bottom panel with the battery away from the unit and set it aside, as shown in FIGURE 7-8.
FIGURE 7-8 UPS Battery Setting Right Side Up
Replace the UPS Battery
1. Connect the replacement battery to the battery connector of the PCU. See FIGURE 7-7. Firmly push the connector all the way into the PCU battery connector. No mechanical click or other indication confirms that it is fully inserted.
2. Seat the battery pack in the PCU so that the bottom panel is flush with the edges of the PCU. See FIGURE 7-5.
3. Replace the four Phillips screws and secure the bottom panel to the PCU.
4. Replace the PCU i
28. Precautions on page 5.
2. Remove the front panel by pressing in on the side latches and pulling the cover forward. See FIGURE 5-2.
FIGURE 5-2 Removing the Front Panel (callout: Latch)
3. Locate the disk drive that needs to be replaced. Disk drives are numbered from 1 to 9, starting on the left side of the array.
FIGURE 5-3 Disk Drive Numbering (callouts: Disk 1, Disk 9)
4. Use a coin or small screwdriver to press in and release the drive latch handle.
FIGURE 5-4 Releasing the Latch Handle
5. Use the latch handle to slowly pull the disk drive out 1 inch (2.5 cm). Wait 30 seconds, and then pull the drive out completely. This gives the disk drive time to spin down.
6. Remove the disk drive from the array. See FIGURE 5-5. Push in the latch handle on the removed disk drive to protect it from damage.
Caution: Any disk drive that is removed must be replaced within 30 minutes, or the Sun StorEdge T3 array and all attached arrays will automatically shut down and power off.
FIGURE 5-5 Removin
29. Reset logto and loglevel to the original values noted in Step 8.
<17> set logto
<18> set loglevel 3
Chassis Replacement Procedure
If none of the above procedures resolves the problem, the next repair action is replacement of the chassis backplane assembly. A replacement part must be on site before beginning this procedure. Before starting, the customer must off-load all the data contained in the array. The array must then be removed from host operation. The procedure for replacing a backplane is described in "Replacing the Chassis Backplane Assembly" on page 126. Once the backplane has been replaced and the previous FRUs installed, rerun the ofdg diagnostics. If the problem persists, replace the entire Sun StorEdge T3 array.
CHAPTER 9 Chassis Backplane Assembly
This chapter describes how to replace the chassis backplane assembly and contains the following sections:
• "Troubleshooting the Chassis Backplane Assembly" on page 125
• "Replacing the Chassis Backplane Assembly" on page 126
Troubleshooting the Chassis Backplane Assembly
The array chassis FRU rarely needs to be replaced. However, the chassis part number is available to replace the backplane and chassis if necessary. These must be replaced together because they are factory
30. Controller Card
FIGURE A-6 Controller Card
TABLE A-5 Controller Card
Item  Part Number  Description
1     F501-5710    T3 controller card
2     F375-0085    Interconnect card assembly
3     F370-3990    Empty chassis backplane assembly
4     F300-1454    Power supply and cooling unit
Drive Assembly
FIGURE A-7 Drive Assembly
TABLE A-6 Drive Assembly
Item  Part Number  Description
1     F540-4287    Drive assembly, 18 GB
2     F540-4367    Drive assembly, 36 GB
3     F370-3990    Empty chassis backplane assembly
Cable and Interconnect Assemblies
FIGURE A-8 Cables and Interconnects
TABLE A-7 Cable and Interconnect Assemblies
Item  Part Number  Description
1     F530-2842    Interconnect cable, short
2     F530-2843    Interconnect cable, long
3     F180-1918    Locking power cord
4     F537-1034    Fiber optic cable, Sun StorEdge T3 array, LC SFF to SC
5     (1)          Shielded Ethernet cable, category 5
6     F537-1020    Fiber optic cable, Sun StorEdge T3 array
7     F370-3989    MIA adapter
(1) Found in F370-4119-02 Diagnostic Cable Kit
APPENDIX B Sun StorEdge T3 Array System Defaults
This appendix lists the Sun StorEdge T3 array defaults and is divided into the following sections: Boot Defaults on pa
31. Storage Automated Diagnostic Environment, Storage Automated Diagnostic Environment message monitoring, the Sun StorEdge T3 array syslog, and the FC-AL connected host messages file. Data from these sources is used to determine the most likely failed FRU within the Sun StorEdge T3 array system.
Overview
The procedures in this chapter assume that the person servicing the equipment has been trained on the product and that the required service manuals are available. A serial maintenance cable kit (part number 370-4119) must be available, along with a terminal or host port connection.
Note: To collect the information required to diagnose back-end FC-AL loop problems, several of the engineering-only dot commands must be used. Only the status options of these dot commands are used.
Diagnosing and correcting back-end FC-AL loop problems can take up to five steps:
1. Determine that there has been a failure in the back-end drive loop. Diagnosing the problem requires that you analyze the collected data and determine the most likely failed FRU from the data available. This procedure is described in "Diagnosing an FC-AL Loop" on page 105. Once you identify a suspected FRU, use one or more of the following steps to isolate and then replace the failed FRU.
2. Isolate, replace, and verify the interconnect cards and/or the loop cable. Interconnect cards, sometimes referred to as unit interconnec
32. Sun Microsystems, Inc. The OPEN LOOK and Sun Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees who implement OPEN LOOK GUIs and otherwise comply with Sun's written license agreements.
Federal Acquisitions: Commercial Software. Government Users Subject to Standard License Terms and Conditions.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Copyright 2002 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054, USA. All rights reserved. This product or document is distributed under licenses that restrict its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without the prior written authorization of Sun and its licensors, if any. Third-party software that includes technology rel
33. WARNING: Resetting system NVRAM to default, are you sure? [N]: y
t300 <4>
Note: The set -z command resets the set parameters, whereas boot -w wipes out all the volumes and sys parameters. Refer to the Sun StorEdge T3 Array Administrator's Manual for more detailed information on setting block size.
Caution: The set -z command resets the IP address of the units to 0.0.0.0. You will need to reassign the IP address to the master unit after you cable the partner group together, but before powering on, as described in the next section.
7. Power off both units.
a. Type:
<4> shutdown
Shutdown the system, are you sure? [N]: y
b. Press the power button once on each power and cooling unit to turn the switch off.
Cabling a Partner Group
After changing the array settings on the alternate master to the factory default and reverifying that both units run the same firmware levels, you are ready to connect the arrays.
1. Place the alternate master on top of the master unit.
• If the units are installed in a cabinet, make sure that the alternate master is installed in the slot directly above the master unit. If you need to change the position in the cabinet, refer to the rackmount installation instructions in the Sun StorEdge T3 Array Installation and Configuration Manual.
• If the units are cabled to the hosts and power sources such that they
34. [iostat output listing: values not recoverable]
In the above example, if the normal iostat is used as a notification threshold, the impacted iostat indicates that there might be a problem in the master (u1ctr) controller in this redundant partner group.
Storage Automated Diagnostic Environment Message Monitoring
If Storage Automated Diagnostic Environment message monitoring is installed and running, it sends email messages indicating problems. For example, in the case of the performance impact illustrated above, the email might contain the following data:
CODE EXAMPLE 8-10 Example Storage Automated Diagnostic Environment Message Monitoring Email Message Data
Mar 07 18:33:22 T3a ISR1[1]: N: u1d9 SVD_PATH_FAILOVER: path_id = 0
Mar 07 18:33:22 T3a ISR1[1]: N: u1d8 SVD_PATH_FAILOVER: path_id = 0
Mar 07 18:33:22 T3a ISR1[1]: N: u1d7 SVD_PATH_FAILOVER: path_id = 0
Mar 07 18:33:22 T3a ISR1[1]: N: u1d6 SVD_PATH_FAILOVER: path_id = 0
Mar 07 18:33:22 T3a ISR1[1]: N: u1d5 SVD_PATH_FAILOVER: path_id = 0
Mar 07 18:33:22 T3a ISR1[1]: N: u1d4 SVD_PATH_FAILOVER: path_id = 0
In this example, this data was pulled by Storage Automated Diagnostic Environment message monitoring from the remote host log file that the array sent syslog entries to. Stora
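The failover entries in the email excerpt above lend themselves to a quick tally per drive. The following is a minimal sketch, not part of the manual: it assumes each log line carries a disk FRU ID of the form u<unit>d<disk> immediately before the SVD_PATH_FAILOVER event name, and the SAMPLE lines and function names are hypothetical illustrations.

```python
import re
from collections import Counter

# Assumed pattern: a disk FRU ID (u<unit>d<disk>) followed by the event name.
FAILOVER = re.compile(r"\b(u\d+d\d+)\b[:\s]+SVD_PATH_FAILOVER")


def count_failovers(lines):
    """Count SVD_PATH_FAILOVER events per disk FRU ID."""
    counts = Counter()
    for line in lines:
        match = FAILOVER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def drives_over_threshold(lines, threshold=1):
    """Return the FRU IDs with at least `threshold` failover events, sorted."""
    counts = count_failovers(lines)
    return sorted(fru for fru, n in counts.items() if n >= threshold)


# Hypothetical sample lines, shaped like the email excerpt above.
SAMPLE = [
    "Mar 07 18:33:22 T3a ISR1[1]: N: u1d9 SVD_PATH_FAILOVER: path_id = 0",
    "Mar 07 18:33:22 T3a ISR1[1]: N: u1d8 SVD_PATH_FAILOVER: path_id = 0",
    "Mar 07 18:33:23 T3a ISR1[1]: N: u1d9 SVD_PATH_FAILOVER: path_id = 0",
    "Mar 07 18:33:24 T3a FCC0[1]: N: u1ctr Port event received on port 0",
]
```

Failovers reported for many drives in the same unit at the same moment, as in the excerpt, suggest a shared loop component rather than a single drive, which is consistent with the manual's reading of that example.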
35. Write RAID Stripe Write RAID 5 RMW Insert RAID 5 Recon Insert Data Sink Recon Stripe 0x1000 Copy Recon drv <> stdby RAID 1 Recon RAID 0 Insert into disk Block Internal Stripe 0x2000 RAID 0 Write Data Init RAID 1 Write Data Init RAID 5 Write Data Init Flush Stripe 0x2020 RAID 0 Cache Flush RAID 1 Cache Flush RAID5 RMW Cache flush RAID 5 Recon Cache Flush RAID 5 Stripe flush Verify Stripe 0x4000 RAID 0 Read Verify RAID 1 Read Verify RAID 5 Read Verify
SCSI Command Set
A partial list of the SCSI commands available with the Sun StorEdge T3 array is given in TABLE C-11. For a complete list of the commands, see http://www.t10.org/lists/op-num.htm
TABLE C-11 SCSI Command Set
Opcode  Commands                Supported  Notes
0x08    READ 6                  yes
0x28    READ 10                 yes
0x0A    WRITE 6                 yes
0x2A    WRITE 10                yes
0x2E    WRITE AND VERIFY        yes
0x2F    VERIFY                  yes
0x00    TEST UNIT READY         yes
0x03    REQUEST SENSE           yes
0x07    REASSIGN BLOCKS         no
0x12    INQUIRY                 yes
0x16    RESERVE 6               yes
0x56    RESERVE 10              yes
0x17    RELEASE 6               yes
0x57    RELEASE 10              yes
0x1B    START STOP UNIT         yes
0x25    READ CAPACITY           yes
0x1D    SEND DIAGNOSTIC         yes
0x1A    MODE SENSE 6            yes
0x5A    MODE SENSE 10           yes
0x15    MODE SELECT 6           yes
0x55    MODE SELECT 10          yes
0xA0    REPORT LUNS             yes
0x5E    PERSISTENT RESERVE IN   yes
TABLE C-11 SCSI Command Set (Continued)
Opcode Commands Supp
36. a mismatch is detected.
• rate specifies the speed, with 1 slowest and 8 fastest.
Note: The vol command is not reentrant. Other vol commands cannot run on the array or partner group until the vol verify operation has completed.
Note: It is good practice to run vol verify before recycling backup tapes, to be sure the image is correct before overwriting previous images.
Checking Drive Temperature
Use the fru stat command on the array to check disk drive temperatures.
<43> fru stat
CTLR STATUS STATE ROLE PARTNER TEMP
u1ctr ready enabled master u2ctr 21 0
u2ctr ready enabled alt master u1ctr 30 25
Note: A warning message appears in the array syslog file if a disk drive reaches 65 degrees C. The array automatically starts spinning down an individual drive if the drive's temperature reaches 75 degrees C.
Disk Drive LEDs
LEDs at the top of each disk drive indicate drive activity and status. These LEDs are visible through the front cover of the unit. TABLE 5-2 lists the possible drive LED states and a description for each state.
FIGURE 5-1 Disk Drive LEDs Viewed Through Front Cover (callout: Disk drive LEDs)
TABLE 5-2 Disk Drive LED Descriptions
Drive Activity Drive Status
37. between fast_find and find is that fast_find does not attempt to drill down to a disk port (that is, detect and isolate down to a bad disk port), while find will try, using Type 1 and Type 2 algorithms. The fast_find option assumes that the probability of loop failures caused by either a bad interconnect cable or loop card is much higher than the probability of loop failures caused by a bad disk port. Therefore, use fast_find before find to first weed out bad interconnect cables and loop cards; then use find to weed out bad disk ports if problems still exist.
The ofdg find Option
The find option provides a Go/No-Go loop test. If the loop test fails, Loop Fault Diag is invoked to drill down and find the bad FRU(s). The find option uses two different drill-down algorithms to detect bad FRU(s):
• Type 1: bypass one disk port at a time and test.
• Type 2: find any three disk ports that work, then enable one disk port at a time and test. Use Type 2 only if Type 1 is unsuccessful.
The Loop Fault Diag can detect and isolate down to a single disk port but, depending on the system configuration, can be time consuming.
Repair Procedures
Begin by replacing the FRUs that have the minimum impact on the customer's operation, in the following order:
1. "Interconnect Card Replacement Procedure" on page 115
2. RAID Co
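The two drill-down strategies named for the ofdg find option can be illustrated with a small simulation. This is only a sketch of the general idea under stated assumptions, not the array firmware's implementation: the loop test is modeled as a hypothetical callable that passes when no enabled port is bad, and Type 2 is read here as testing each remaining port individually on top of a known-good trio. All names are illustrative.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List, Sequence

# A loop test takes the set of enabled ports and returns True when the loop passes.
LoopTest = Callable[[FrozenSet[str]], bool]


def make_loop_test(bad_ports: Iterable[str]) -> LoopTest:
    """Hypothetical simulator: the loop passes only if no enabled port is bad."""
    bad = frozenset(bad_ports)
    return lambda enabled: not (enabled & bad)


def type1(ports: Sequence[str], loop_test: LoopTest) -> List[str]:
    """Bypass one port at a time; a port is suspect if the loop passes without it."""
    all_ports = frozenset(ports)
    return [p for p in ports if loop_test(all_ports - {p})]


def type2(ports: Sequence[str], loop_test: LoopTest) -> List[str]:
    """Find three ports that work together, then enable each other port in turn."""
    for trio in combinations(ports, 3):
        base = frozenset(trio)
        if loop_test(base):
            return [p for p in ports
                    if p not in base and not loop_test(base | {p})]
    return list(ports)  # no working trio found, so no port can be cleared


PORTS = [f"u1d{n}" for n in range(1, 10)]
```

In this model, Type 1 isolates a single bad port but returns nothing when two ports are bad, because bypassing either one alone still leaves a failing loop. That is one way to see why a second, slower strategy is kept in reserve.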
38. bin file header size 265e14 checksum be4ec46 start 20010 base 20000
This copies the firmware to the bootable reserved areas on the local disk.
9. Set the bootmode back to auto. If you forget this step, the system will continue doing tftpboots.
<4> set bootmode auto
10. Reset the system.
<5> reset
Reset the system, are you sure? [N]: y
11. Reconnect the Ethernet cable to the alternate master.
Configuring a Server for Remote Booting
If a Sun StorEdge T3 array is unable to boot, you can use tftpboot to reload the firmware. This requires configuring a remote server. To configure a remote server to tftp boot a Sun StorEdge T3 array, follow these steps:
1. In a user file system, create a directory on the server called tftpboot.
boothost# mkdir /tftpboot
2. Set permissions to allow users read/write access.
boothost# chmod 777 /tftpboot
3. Copy the Sun StorEdge T3 array boot code into the tftpboot directory.
boothost# cp nbnnn.bin /tftpboot
Where nbnnn.bin is the current boot code file identification number. For example, nb101.bin.
4. Verify that /tftpboot/nbnnn.bin is readable.
boothost# chmod 755 /tftpboot/nbnnn.bin
5. Edit the /etc/inetd.conf file and uncomment the tftp line.
tftp dgram udp wait root /usr/sbin/in.tftpd in.tftpd -s /tftpboot
6. Restart inetd.
boothost# ps -eaf | grep i
39. cannot be placed in close proximity, rearrange the cabling so that the units can be placed together.
2. Make sure that the 100BASE-T cables are connected to a network with the same management host.
3. Connect the interconnect cables to the interconnect cards as shown in FIGURE 10-1. Make sure you connect the cables to the correct interconnect card connectors, exactly as shown in the figure. This cable connection determines the master and alternate master relationship. Tighten the retaining screws. The remaining connectors are reserved for possible future expansion units.
FIGURE 10-1 Connecting the Interconnect Cables (callouts: Alternate master controller unit, Master controller unit)
A fully cabled partner group is shown in FIGURE 10-2.
FIGURE 10-2 Fully Cabled Partner Group (callout: Master controller unit)
Caution: Do not power on the arrays yet. You must configure a RARP server connected to the array with the
40. with your facilities manager or a qualified electrician if you are not sure what type of power system exists in your building.
Caution: Not all power cords have the same capacity. Household-type cords do not have overload protection and are therefore not suitable for use with computers. Do not use household extension cords to connect your Sun products.
Caution: Your Sun product is supplied with a grounded power cord. To reduce the risk of electric shock, always plug it into a grounded outlet.
The following warning applies only to equipment with a power switch that has a Standby position.
Caution: The power switch of this product functions exclusively as a standby device. The power supply plugs are designed to be the primary disconnect device for the equipment. You must disconnect ALL power plugs from the equipment before removing power. The equipment must be installed near the outlet so that the outlet is easily and quickly accessible.
Lithium Battery
Caution: On the system control boards, there is a lithium battery inserted in the time clock
41. On: Applies AC power to the system. Depending on the type of power switch your device has, one of the following symbols may be used:
Caution: The workplace-dependent noise level defined in DIN 45 635 Part 1000 must be 70 dB(A) or less.
SELV Compliance
The safety status of I/O connections complies with SELV requirements.
Power Cord Connection
Caution: Sun products are designed to work with single-phase power systems having a grounded neutral conductor. To reduce the risk of electric shock, do not plug Sun products into any other type of power system. Contact your facilities manager or a qualified electrician if you are not sure what type of power is supplied to your building.
Caution: Not all power cords have the same current ratings. Household extension cords do not have overload protection and are not meant for use with computer systems. Do not use household extension cords with your Sun product.
Caution: Your Sun product is shipped with a grounding-type (three-wire) power cord. To reduce the risk of electric shock, always plug the cord into a grounded power outlet.
The following caution applies only to devices with a Standby power switch.
Caution: The power switches of this product function as standby-type devices only. The power cords serve as the primary disconnect device for the system. ALL
42. disk drive SEAGATE ST336704FSUN A726 3CD1HNHO
u2d6 disk drive SEAGATE ST336704FSUN A726 3CD1HH4Z
u2d7 disk drive SEAGATE ST336704FSUN A726 3CD1H92W
u2d8 disk drive SEAGATE ST336704FSUN A726 3CD1HN9T
u2d9 disk drive SEAGATE ST336704FSUN A726 3CD1HKOP
u1l1 loop card SCI SJ 375 0085 01 5 02 Flash 1413
u1l2 loop card SCI SJ 375 0085 01 5 02 Flash 2294
u2l1 loop card SCI SJ 375 0085 01 5 02 Flash 001415
u2l2 loop card SCI SJ 375 0085 01 5 02 Flash 002054
u1pcu1 power cooling unit TECTROL CAN 300 1454 01 0000 001787
u1pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001784
u2pcu1 power cooling unit TECTROL CAN 300 1454 01 0000 001544
u2pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001545
u1mpn mid plane SCI SJ 370 3990 01 0000 000953
u2mpn mid plane SCI SJ 370 3990 01 0000 000958
<6> fru stat
CTLR STATUS STATE ROLE PARTNER TEMP
u1ctr ready enabled master u2ctr 31 0
u2ctr ready enabled alt master u1ctr 30 5
DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME
u1d1 ready enabled data disk ready ready 30 vol1
u1d2 ready enabled data disk ready ready 31 vol1
u1d3 ready enabled data disk ready ready 30 vol1
u1d4 ready enabled data disk ready ready 29 vol1
u1d5 ready enabled data disk ready ready 29 vol1
u1d6 ready enabled data disk ready ready 30 vol3
u1d7 ready enabled data disk
43. disk drives are in an optimal state as follows:
a. Use the fru stat command to confirm that all disks are ready and enabled.
b. Use the vol stat command to confirm that all disks configured into volumes are in an optimal state (reported as drive state 0).
If either of these commands displays drive issues, correct the problems before proceeding with the firmware download.
4. Use the proc list command to verify that no volume operations are in progress. Allow any volume operation in progress to complete before proceeding with the firmware download.
5. Use the refresh -s command to verify that no battery refresh operations are in progress. Allow any battery refresh in progress to complete before proceeding with the firmware download.
6. Unmount the array volume(s) from the host to ensure there is no host I/O activity.
unmount t3-filesystem-name
7. Unmount internal array volume(s).
<1> vol unmount volume-name
8. Install the firmware using the disk download command.
<2> disk download u1d1-9 filename
The filename is the file name of the disk drive firmware image that was transferred by FTP to the array in Step 1.
Caution: If the array is configured with disk drives from different manufacturers or of different types, the disk command can download firmware for only one manufacturer's drive type at a time.
Verify that the download was successful using either the CLI
9. Use the fru list command to verify
44. during a Sun StorEdge T3 array boot cycle.
TABLE C-7 Firmware Status Boot Messages
Status             Explanation
firmware status 0  ISP is waiting for configuration process to complete
firmware status 1  ISP is waiting for ALPA assignment
firmware status 2  ISP is waiting for port login
firmware status 3  ISP is ready and optimal
firmware status 4  ISP has lost loop synchronization
firmware status 5  ISP has experienced an unrecoverable error
firmware status 6  ISP re-initialization
firmware status 7  ISP is not participating in the loop
If the firmware status given in either of these boot messages is not 3, a drive or other component in the array could be faulty. The number of devices found is important when trying to determine the failing device. For example, if only half of the devices are found, a loop card or loop cable could be faulty.
The following message is generated by the ISP that services the front-end (host) loop. A status of 7 (not participating) does not necessarily indicate a problem. The attached host might not be running and thus cannot respond to the Sun StorEdge T3 array.
Initializing host port u2p1 ISP2200 firmware status 7
Sun StorEdge T3 Array Workgroup Configuration T3B 2
Starting POST
POST end
Starting T3B EP Release 2.01 2002/07/30 16:33:52 (129.150.28.81)
Copyright (C) 1997-2002 Sun Microsystems, Inc. All Rights Reserved.
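When scanning a captured boot log, the status table above can be applied mechanically. The sketch below is illustrative only: the explanations are copied from TABLE C-7, while the regular expression, the function name, and the assumption that the number follows the words "firmware status" (optionally with an equals sign) are not taken from the manual.

```python
import re

# Explanations copied from TABLE C-7.
ISP_FIRMWARE_STATUS = {
    0: "ISP is waiting for configuration process to complete",
    1: "ISP is waiting for ALPA assignment",
    2: "ISP is waiting for port login",
    3: "ISP is ready and optimal",
    4: "ISP has lost loop synchronization",
    5: "ISP has experienced an unrecoverable error",
    6: "ISP re-initialization",
    7: "ISP is not participating in the loop",
}

# Assumed message shape: "... firmware status 7" or "... firmware status = 7".
STATUS_RE = re.compile(r"firmware status\s*=?\s*(\d+)")


def explain_boot_line(line):
    """Return (code, explanation, is_optimal), or None if the line has no status."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    explanation = ISP_FIRMWARE_STATUS.get(code, "unknown status")
    return code, explanation, code == 3
```

A result with is_optimal set to False is a prompt to look further, not a verdict. As the text notes, status 7 on a host port may only mean that the attached host is not running.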
45. except for 2 things you need to be aware of:
1. Beginning with the 1.17 and 2.0 bootcode releases, the target is now reported using the hex version of the 7-bit loop ID (the SEL_ID column in the AL_PA chart).
2. You will see initiators with very low numbers, like EF and E8. These are fabric ports on a switch and/or third-party HBAs, such as JNI. Check your task: now both FCC0 and FCC2 events can have low initiator numbers.
Port Event Messages
These are typically port login/logout events. They are common on the back end when a LUN fails over, and on the host side when a host reboots or the loop goes down for some reason. This is a common host port sequence:
ISR1[1]: N: u1ctr ISP2100[2] Received LIP(f7,f7) async event
FCC0[1]: N: u1ctr Port event received on port 0 abort 0 id 123
FCC0[1]: N: u1ctr Port event received on port 0 abort 0 id 124
ISR1[1]: N: u1ctr ISP2100[2] Received LIP(f7,f7) async event
FCC0[1]: N: u1ctr Port event received on port 0 abort 0 id 124
FCC0[1]: N: u1ctr Port event received on port 0 abort 0 id 123
where id is the initiator. This Sun StorEdge T3 array is connected to a loop with 2 initiators. A LIP is received on the host port on u1, and the HBAs (initiators) connected to that port log out and log back in. You would see something similar if a switch port were reset, but the id would be low on the chart, an E8 for example.
Identif
46. for each loop card
HBIT  Heartbeat task
LPCT  Loop card monitor task
CFGT  Configuration task
WXET  WriteTransferTask: waits for command set completion
5X01  StartTransferTask: waits for the first command set to complete for the stripe and the head of the stripe order list
XFRT  Waits for a command, decomposes it into stripes, and sends each stripe to the stripe requestor task
MXFT  Mirror transfer task
HS01  Simulates host I/Os to configured volumes
SMON  Handles events which affect cache mirroring
FCC0  ScsiPortCmdTask: port task to handle host commands
FCC2  ScsiPortCmdTask: back-end loop mirror task
SIMT  Brings ISP back online (part of sim reset)
SVDT  Handles back-end loop link events such as LIPs, loop up, loop down, etc.
SVHT  Handles front-end loop link events such as LIPs, loop up, loop down, etc.
SDFT  Handles path and loop failover events
ONDG  Executes back-end loop diagnostics
TMON  Monitors disk temperature
IPCS  For multi-controller inter-processor communication
IPCR  Partner to IPCS
LT00  Handles long transfer command execution
LNXT  Handles long non-transfer command execution. These are commands that take a long time, like Reconstruct
MNXT  Handles medium non-transfer command execution
SNXT  Handles short non-transfer command execution
SCHD  Schedule manager
Ftpd  FTP daemon
ANNT  Wait for announce string and display is syslog daemon
47. fully powered on, start a Telnet session. The Telnet session will connect to the top unit.
If the host cannot telnet to the array, investigate the following other possible causes:
• RARP server not responding. To determine if this is the problem:
  • Verify that the RARP daemon is running on the host system.
  • Verify that the /etc/nsswitch.conf file is properly configured on the RARP server.
  • In the Solaris environment, use the snoop command to verify that the array is attempting to establish RARP communication with the Solaris server.
• MAC address is incorrect. In the Solaris environment, use the snoop command to specify the MAC address of the array and to determine if any RARP packets are transmitted. If you observe no transmissions during a reboot of the array, verify that the MAC address on the array label matches the MAC address configured on the RARP server.
• Netmask is incorrect. The default netmask address used on the array is 255.255.255.0. If the local subnet uses a different netmask, the RARP operation might not work.
• Inoperable network connections. If using hubs to connect to the network, try eliminating or replacing the hub.
• Incorrect IP address. Connect to the array through the serial port and verify that the IP address is correct.
Identifying Data Channel Failures
The data channel encompasses the host data path that extends from the host bus adapter to the media interface adapter (MIA) attached to the array. Errors
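The netmask item in the Telnet troubleshooting list above can be checked on paper before going on site. The following minimal sketch uses only the Python standard library; the function name and the sample addresses are hypothetical, and the only value taken from the manual is the array's default netmask of 255.255.255.0.

```python
import ipaddress


def same_subnet(array_ip, host_ip, netmask="255.255.255.0"):
    """Return True if host_ip falls inside the subnet implied by array_ip and netmask.

    The default netmask is the array default quoted in the manual.
    """
    # strict=False lets us pass a host address rather than a network address.
    network = ipaddress.ip_network(f"{array_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(host_ip) in network
```

If the array and the RARP server agree under the site's real netmask but not under 255.255.255.0, the mismatch described in the list is a plausible cause worth ruling out.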
48. host has been configured to use the MAC address of the bottom unit, this alternate configuration can cause the units to malfunction. If the bottom unit is incorrectly cabled, making the bottom unit the alternate master, the bottom unit's Ethernet port will be inactive unless a failover situation occurs. In that event, the IP and MAC address of the bottom unit will take over the values of the master (top) unit.
If the partner group has been cabled together incorrectly, the following procedure can help determine if the top unit is acting as the master controller.
1. Determine the MAC address of the top unit. The MAC address is located on a pull-out tab at the front of the unit, to the left of the first disk drive (FIGURE 3-4).
FIGURE 3-4 MAC Address on the Pull-Out Tab (callout: Pull-out tab)
2. Edit the files on the RARP server to include the MAC address of the top unit.
a. Edit the /etc/ethers file by adding the MAC address and array name. For example:
8:0:20:7d:93:7e array-name
In this example:
• 8:0:20:7d:93:7e is the MAC address.
• array-name is the name of the master controller unit.
b. Edit the /etc/hosts file with the IP address and array name. For example:
123.123.123.111 array-name
In this example, 123.123.123.111 is the assigned IP address.
c. Edit the /etc/nsswitch.conf file to reference the lo
49. performing procedures other than those specified here may expose the user to hazardous radiation.
GOST-R Certification Mark
Nordic Lithium Battery Cautions
Norway
Caution: Lithium battery. Danger of explosion. When replacing, use only the battery recommended by the equipment manufacturer. Return used batteries to the equipment supplier.
Sweden
Caution: Danger of explosion if the battery is replaced incorrectly. Use the same battery type or an equivalent type recommended by the equipment manufacturer. Dispose of used batteries according to the manufacturer's instructions.
Denmark
Caution: Lithium battery. Danger of explosion if handled incorrectly. Replace only with a battery of the same make and type. Return the used battery to the supplier.
Finland
Caution: The battery may explode if it is installed incorrectly. Replace the battery only with a type recommended by the equipment manufacturer. Dispose of used batteries according to the manufacturer's instructions.
Contents
Preface
Troubleshooting Overview
Network Storage Overview
Maintenance Precaution
Error Messages and Logs
Sun StorEdge T3 Array Generated Messages
Host Generated Message
Sun Storage Automated Diagnostic Environment
Static Electricity Precautions
Connecting to the
50. level is listed as Release 2 1 136 Sun StorEdge T3 Array Field Service Manual November 2002 b Type fru list to display EPROM disk drive and interconnect card firmware levels For example lt 2 gt fru list ID TYPE VENDOR MODEL REVISION SERIAL ulctr controller card 0301 501 5710 02 020100 020101 112035 u2ctr controller card 0301 501 5710 02 020100 020101 112122 uldl disk drive SEAGATE ST336704FSUN A726 3CD1HMKJ uld2 disk drive SEAGATE ST336704FSUN A726 3CD1HH2A uld3 disk drive SEAGATE ST336704FSUN A726 3CD1H9WS uld4 disk drive SEAGATE ST336704FSUN A726 3CD1HM64 uld5 disk drive SEAGATE ST336704FSUN A726 3CD1HMC2 uld6 disk drive SEAGATE ST336704FSUN A726 3CD1HM63 uld7 disk drive SEAGATE ST336704FSUN A726 3CD1HE3A uld8 disk drive SEAGATE ST336704FSUN A726 3CD1HNKO uld9 disk drive SEAGATE ST336704FSUN A726 3CD1HM5P u2d1 disk drive SEAGATE ST336704FSUN A726 3CD1HHH5 u2d2 disk drive SEAGATE ST336704FSUN A726 3CD1HMJC u2d3 disk drive SEAGATE ST336704FSUN A726 3CD1HGKR u2d4 disk drive SEAGATE ST336704FSUN A726 3CD1HLBJ u2d5 disk drive SEAGATE ST336704FSUN A726 3CD1HNHO u2d6 disk drive SEAGATE ST336704FSUN A726 3CD1HH4Z u2d7 disk drive SEAGATE ST336704FSUN A726 3CD1H92W u2d8 disk drive SEAGATE ST336704FSUN A726 3CD1HN9T u2d
51. normal and expected behavior.

Reset Log Message Types

If the set command is used with the loglevel parameter to set the notification level to 2 (warning and error messages) or higher (3 or 4), you can trace the reason for the reset by examining the contents of the syslog file. This is possible because the reset log information is downloaded into the syslog file every time the system resets. If desired, the reset log information can also be downloaded whenever the logger dmprstlog command is issued.

TABLE C-4 Reset Log Message Types

Index  Type       Type Value  Description
0      Hardware   0x1000      User reset
1      Exception  0x2000      Exception
2      Assertion  0x3000      Software assertion
3      RAIDFail   0x4000      RAID fatal error
4      Takeover   0x5000      Takeover
5      pSOSFail   0x6000      pSOS fatal error
6      SysFail    0x7000      System error

Type the following to capture the log:

t3:/:<n> logger dmprstlog

Type the following to clear the log:

t3:/:<n> logger clrrstlog

Reset Log Messages

TABLE C-5 Reset Log Messages

Type         Mask  Description
RESET_FAIL   1000  Hardware Reset
EXCEPT_FAIL  2000
             2003  Data access exception
             2004  Instruction access exception
             2005  Alignment exception (operand not word aligned)
             2008  Floating Point exception
ASSERT_FAIL  3000  Software detected fault
RAID_FAIL    4000
si SNXF_IN 4001 Short non transfer
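The type values in TABLE C-4 can be used to classify a reset code found in the syslog file. The following Python sketch assumes that the upper hex digit of a reset code selects the type, which is how the codes in TABLE C-5 appear to be grouped (for example, 2004 falls under the 2000 Exception type). The function name reset_type is hypothetical.

```python
# Type values copied from TABLE C-4 (Reset Log Message Types).
RESET_TYPES = {
    0x1000: "Hardware",
    0x2000: "Exception",
    0x3000: "Assertion",
    0x4000: "RAIDFail",
    0x5000: "Takeover",
    0x6000: "pSOSFail",
    0x7000: "SysFail",
}


def reset_type(code: int) -> str:
    """Map a reset code such as 0x2004 to its TABLE C-4 type name.

    Assumption: the type is the code masked with 0xF000.
    """
    base = code & 0xF000
    if base not in RESET_TYPES:
        raise ValueError(f"unknown reset code: {code:#06x}")
    return RESET_TYPES[base]


print(reset_type(0x2004))  # an Instruction access exception code
print(reset_type(0x7001))
```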
52. page 98 m vol mode see The vol mode Command on page 99 m port listmap see The port listmap Command on page 100 m loop stat see The loop stat Command on page 101 96 Sun StorEdge T3 Array Field Service Manual November 2002 m disk pathstat see The disk pathstat Command on page 101 m disk linkstat see The disk linkstat Command on page 103 The examples that follow show a Sun StorEdge T3 array in a redundant partner group configuration with no failed FRUs Chapter 8 Diagnosing and Correcting FC AL Loop Problems 97 The fru stat Command The fru stat command returns the current condition of both disk ports port 1 and port 2 as well as the status of the interconnect cards If there are loop problems this might indicate certain disk ports have a status other than ready or the loop cards with a status other than ready or enabled lt 43 gt fru stat CTLR STATUS STATE ROLE PARTNER TEMP ulctr ready enabled master u2ctr 304 5 u2ctr ready enabled alt master ulctr 30 0 DISK STATUS STATE ROLE PORTI PORT2 TEMP VOLUME uldl ready enabled data disk ready ready 29 voll uld2 ready enabled data disk ready ready 31 voll uld3 ready enabled data disk ready ready 30 voll uld4 ready enabled data disk ready ready 29 voll uld5 ready enabled data disk ready ready 28 voll uld6 ready enabled data dis
53. power cords must be disconnected to remove power from the product Be sure to plug the power cords into a grounded power outlet that is nearby the system and is readily accessible Lithium Battery N viii Caution Caution On the system control board there is a lithium battery molded into the real time clock SGS No MK48T59Y MK48TXXB XX MK48T18 XXXPCZ M48T59W XXXPCZ M4T28 XXYYSHZ or MK48T08 Batteries are not customer replaceable parts They may explode if mishandled Do not dispose of the battery in fire Do not disassemble it or attempt to recharge it Battery Pack N Caution Caution There is a Nickel Metal Hydride battery in the product power supply Panasonic Model HHR200SCP There is danger of explosion if the battery is mishandled or incorrectly replaced Replace only with the same type of Sun Microsystems battery Do not disassemble it or attempt to recharge it outside the system Do not dispose of the battery in fire Dispose of thebattery properly in accordance with local regulations System Unit Cover N Caution Caution Do not operate Sun products without the top cover in place Failure to take this precaution may result in personal injury and system damage Laser Compliance Notice Sun products that use laser technology comply with Class 1 laser reguirements Sun StorEdge T3 Array Field Service Manual November 2002 Class 1 Laser Product Luokan 1 Laserlaite
54. ready ready 34 vol3 uld8 ready enabled data disk ready ready 37 vol3 uld9 ready enabled data disk ready ready 32 vol3 u2dl ready enabled data disk ready ready 34 vol2 u2d2 ready enabled data disk ready ready 38 vol2 u2d3 ready enabled data disk ready ready 36 vol2 u2d4 ready enabled data disk ready ready 37 vol2 u2d5 ready enabled data disk ready ready 34 vol2 u2d6 ready enabled data disk ready ready 36 vol4 u2d7 ready enabled data disk ready ready 35 vol4 u2d8 ready enabled data disk ready ready 40 vol4 u2d9 ready enabled data disk ready ready 36 vol4 LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP u211 ready enabled master installed 29 u212 ready enabled slave installed 31 0 ulll ready enabled master installed 29 5 ull2 ready enabled slave installed 30 0 POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2 ulpcul ready enabled line normal normal normal normal normal ulpcu2 ready enabled line normal normal normal normal normal u2pcul ready enabled line normal normal normal normal normal u2pcu2 ready enabled line normal normal normal normal normal If the array reports a ready status with functional FRUs you can now restore the data if necessary and return the array to operation as a single controller unit 154 Sun StorEdge T3 Array Field Service Manual November 2002 Alternate Master Unit to a Single Controller Unit The former alternate master unit might be operating on an outdated file system If you apply a firmware p
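The check described above (every FRU reports ready and enabled) can be sketched in Python. This is a minimal illustration, assuming the fru stat output has been captured as one text row per FRU with the FRU identifier, STATUS, and STATE in the first three columns. The function name frus_not_ready and the failing row in the usage example are hypothetical.

```python
def frus_not_ready(rows):
    """Return the FRU ids whose STATUS/STATE are not 'ready'/'enabled'."""
    bad = []
    for row in rows:
        fields = row.split()
        if len(fields) < 3:
            continue  # blank or incomplete row
        fru_id, status, state = fields[:3]
        if status == "STATUS":
            continue  # column header row such as 'DISK STATUS STATE ...'
        if status != "ready" or state != "enabled":
            bad.append(fru_id)
    return bad


sample = [
    "DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME",
    "u1d8 ready enabled data disk ready ready 37 vol3",
    "u2l1 ready enabled master installed 29",
    "u1pcu1 ready enabled line normal normal normal normal normal",
]
print(frus_not_ready(sample))
```

An empty result for the captured rows corresponds to the condition under which the array can be returned to operation.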
55. real tipo SGS Num MK48T59Y MK48TXXB XX MK48T18 XXXPCZ M48T59W XXXPCZ M4T28 XXYYSHZ o MK48T08 El usuario no debe reemplazar las baterfas por si mismo Pueden explotar si se manipulan de forma err nea No arroje las baterfas al fuego No las abra o intente recargarlas Paquete de pilas hidruro met lico de niquel en el sistema de alimentaci n de la unidad Panasonic modelo HHR200SCP Existe riesgo de estallido si el paquete de pilas se maneja sin cuidado o se sustituye de manera indebida Las pilas s lo deben sustituirse por el mismo tipo de pilas de Sun Microsystems No las desmonte ni intente recargarlas fuera del sistema No arroje las pilas al fuego Des chelas siguiendo el m todo indicado por las disposiciones vigentes Caution Precauci n Existe una pila de Tapa de la unidad del sistema funcionar los productos Sun sin la tapa superior colocada El hecho de no tener en cuenta esta precauci n puede ocasionar dafios personales o perjudicar el funcionamiento del equipo Caution Precauci n Es peligroso hacer Aviso de cumplimiento con requisitos de laser Los productos Sun que utilizan la tecnologia de l ser cumplen con los requisitos de laser de Clase 1 Class 1 Laser Product Luokan 1 Laserlaite Klasse 1 Laser Apparat Laser Klasse 1 xiv Sun StorEdge T3 Array Field Service Manual November 2002 N Caution Precauci n El manejo de los controles los ajustes o
56. status for this failure. The loop stat command would show a normal status for this failure. The disk pathstat command would show a normal status for this failure. The disk linkstat command would show the following error conditions for this failure.

CODE EXAMPLE 8-12 Example disk linkstat Error Data

disk linkstat u1d1-9 path 0 master controller
DISK  LINKFAIL LOSSSYNC LOSSSIG PROTOERR INVTXWORD INVCRC
u1d1  Disk Link Status Failed
u1d2  Disk Link Status Failed
u1d3  Disk Link Status Failed
u1d4  Disk Link Status Failed
u1d5  Disk Link Status Failed
u1d6  Disk Link Status Failed
u1d7  Disk Link Status Failed
u1d8  Disk Link Status Failed
u1d9  Disk Link Status Failed
fail

When the disk linkstat command is run from the master controller, it is unable to access any of the link registers for drives u1d1-9. This supports the conclusion that loop 1 (path 0) has had a failure.

Once a suspect loop has been determined, use a process of elimination to locate the failed FRU on that loop, as described in the following sections.

Using the ofdg Diagnostic Utility

If the problem is still unresolved, the last diagnostic tool to use is the off-line drive diagnostic utility, ofdg. Because the ofdg diagnostic requires that the T3 partner group be removed from host access, it is a highly disruptive procedure stops all data access to the T3 partner grou
57. that are probably going to be most useful in field-based diagnosis are the Target (the drive), the LUN, and the OP code, which will generate either a response or the actual OP code text itself.

ITL Message Examples

Host Port Message

FCC0 1 N u1ctr ITL 7D 5 1 TT 20 TID 9CA8 OP 0 Target in Unit Attention

where:
- FCC0 = task on external loop 3 (host loop)
- 1 = enclosure_id 1 (u1)
- 7D = initiator 7D (alpa x01, HBA on host)
- 5 = target 5 (alpa xEF)
- 1 = LUN 1

This is a very common message, seen during the Sun StorEdge T3 array boot sequence or as the result of host activity such as a reboot and luxadm inquiry. The initiator is sending a SCSI command to LUN 1 on controller 1.

Back-End Loop Message

FCC2 1 N u1ctr ITL 1 0 1 TT 20 TID AAE8 OP 0 Target in Unit Attention

where:
- FCC2 = task on loop 2
- 1 = enclosure_id 1 (u1)
- 1 = initiator 1 (alpa xE8, ISP chip on ctrl u2)
- 0 = target 0 (alpa xEF, ISP chip on ctrl u1)
- 1 = LUN 1

In this case, ctrl2 (u2) sent a Test Unit Ready cmd (OP 0) through loop 2, and u1 responds with Unit Attention. u2 is checking on the status of the cache mirroring LUN. We know this because it is task FCC2, and an initiator on a host-side loop would have one of the standard initiator AL_PAs, like 7C or 7D, as in the example above.

Interpreting ITL Messages in a Fabric SAN Environment

Everything is the same as the FCAL environment
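The ITL fields described above can be pulled out of a message with a few lines of Python. This is a sketch under stated assumptions: the message is treated as whitespace-separated tokens, the three tokens after ITL are read as initiator, target, and LUN, and the token after OP is read as the OP code. The exact punctuation of a real syslog line may differ, and the function name parse_itl is hypothetical.

```python
def parse_itl(message: str) -> dict:
    """Extract initiator, target, LUN, OP code, and trailing text."""
    tokens = message.split()
    i = tokens.index("ITL")          # raises ValueError if no ITL field
    initiator, target, lun = tokens[i + 1:i + 4]
    j = tokens.index("OP", i)        # first OP token after the ITL field
    opcode = tokens[j + 1].rstrip(":")
    return {
        "initiator": initiator,
        "target": target,
        "lun": lun,
        "opcode": opcode,
        "text": " ".join(tokens[j + 2:]),
    }


# Host port message from the example above.
msg = "FCC0 1 N u1ctr ITL 7D 5 1 TT 20 TID 9CA8 OP 0 Target in Unit Attention"
print(parse_itl(msg))
```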
58. that the firmware download was successful. The current drive firmware level is displayed in the fru list output.

10. Use the reset command to reboot the Sun StorEdge T3 array after all drives have been upgraded.

11. After the array is back online, log in to the array and verify that all FRU states are optimal, as follows:
    a. Use the fru stat command to confirm that all drives are ready and enabled.
    b. Use the fru list command to display the current drive model number and firmware version.
    c. Use the vol stat command to display drive states. All drives must report a drive state of 0 for optimal condition.

12. Remount the volume(s) on the array:

    :/:<4> vol mount volume-name(s)

CHAPTER 6 Interconnect Card Assemblies

This chapter describes how to monitor and replace the interconnect card and upgrade firmware. The chapter contains the following sections:
- Interconnect Card LEDs
- Removing and Replacing an Interconnect Card
- Upgrading Interconnect Card Firmware

Interconnect Card LEDs

Each of the interconnect cards has a status LED for each interconnect cable. TABLE 6-1 lists the possible interconnect card status LED states with descriptions of each state. N Interco
59. the Sun StorEdge T3 Array Installation and Configuration Manual Chapter 6 Interconnect Card Assemblies 79 80 Sun StorEdge T3 Array Field Service Manual November 2002 CHAPTER 7 Power and Cooling Unit Assemblies This chapter describes how to replace the power and cooling unit and monitor the UPS battery The chapter contains the following sections Power and Cooling Unit on page 81 Power and Cooling Unit LEDs on page 83 Removing and Replacing a Power and Cooling Unit on page 85 UPS Battery on page 87 Power and Cooling Unit The power and cooling unit has two active power sources standby and primary power Standby power which is used to power the micro controller on the interconnect card is activated when AC power is present Primary power which is used to power all remaining circuits and disk drives is activated when AC or battery power is present and the power switch is on Each power and cooling unit has a power switch in the rear upper center of the unit Turning off the power on a power and cooling unit affects only that power and cooling unit Therefore to power off all primary power to the unit both power switches on both power and cooling units must be turned off After the switches are turned off system primary power will not actually turn off until the controller has performed an orderly shutdown including writing any data cache to disk This process can take up to two minutes
60. the following sections:
- Preparing the arrays
- Cabling a Partner Group
- Establishing a New IP Address
- Defining and Mounting Volumes on the Alternate Master

Preparing the arrays

1. Decide which unit is the master controller and which is the alternate master.

2. Back up the data on both arrays.

   Caution: Make sure you back up data on both units before proceeding. You need to re-create the volume(s) on the alternate master after cabling the units together.

3. Ensure that the data path between the host and both arrays has been quiesced. There must not be any I/O activity.

4. Start a Telnet session with both arrays.

   a. On the host, use the telnet command with the array name (or IP address) to connect to the array:

      telnet array_name
      Trying 129.150.47.101...
      Connected to 129.150.47.101.
      Escape character is '^]'.
      Telnet session (129.150.47.101)

   b. Log in to the array by typing root and your password at the prompts. The array prompt is displayed.

5. Verify that firmware levels for all array firmware are the same on the master unit and alternate master unit. On both arrays:

   a. Type ver to display the controller firmware level. For example:

      :/:<1> ver
      T3B Release 2.1 2002/07/30 19:16:42 (10.4.35.134)
      Copyright (C) 1997-2001 Sun Microsystems, Inc.
      All Rights Reserved.

      In this example, the controller firmware
61. ulctr SysFail Reset 7001 was initiated at 20010626 163740 Cache memory parity error detected Assertion Reset 14 47 16 sh05 1 W ulctr Assertion Reset 3000 was initiated at 20020308 213140 common msc sxf_task c line 763 Assert err 0 gt 0 BOOT 14 47 16 sh05 1 N CPU state 14 47 16 sh05 1 N RO 000c9ea4 019cf510 002936bc 00000001 00000002 019cf3d0 016408e0 00000001 14 47 16 sh05 1 N R8 00000001 000000c8 000000c8 004d0000 004cd1a0 00294dec 00000000 00000000 14 47 16 sh05 1 N R16 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 14 47 16 sh05 1 N R24 0027ad48 0027a900 00000000 00409ef4 00000000 00000000 008fb408 008fb048 14 47 16 sh05 1 N CR 40000000 XER 00000000 LR 000c9eec CTR 00000000 DSISR 00000000 14 47 16 sh05 1 N DAR 00000000 MSR 0000b930 IP SRR0 001888ec SRR1 0000b930 Exception Reset 19 31 53 pshc 1 W ulctr Exception Reset 2004 was initiated at 20010904 192859 Instruction Access exception 19 31 53 pshc 1 N CPU state Appendix C Sun StorEdge T3 Array Messages 189 19 31 53 pshc 1 N RO 0008640 018b57a8 002936bc 00000019 01870000 0164dfe8 018b5e4c 001b6d2c 19 31 53 pshc 1 N R8 0000b930 0164dfe8 01640d04 004d0000 004cd1a0 00294dec 00000000 00000000 19 31 53 pshc 1 N R16 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 19 31 53 pshc 1 N R24 00000000 00000000 00000000 00000000 00000000 000
62. user can override the automatic selection by typing 1 or 2 within 1 5 seconds after T3B is displayed on the console If the command fails in the middle of update it will be an invalid level 2 code and level 1 code will not select the invalid level 2 code for booting If a bad level 2 code is programmed into ROM successfully then the user can manually select which copy to boot in order to work around the bad level 2 code If this happens it is better to update the level 2 code again in order to override the bad level 2 code copy Level 2 code has a size limitation of 384 Kbytes During boot up the level 2 code occupies RAM space starting at 0x500000 and the level 3 code is loaded by the level 2 code Currently the starting location of level 3 code is fixed at 0x20000 Although level 3 code can start at another location the code space after upload to RAM cannot go over 0x500000 The network of level 2 code will be enabled only when the bootmode is set to tftp Thus the ep command will only work when bootmode is tftp The level 2 code also includes POST Power On Self Test code in the booting process Chapter 4 Controller Card Assembly 53 Third Level Boot Code The third level boot code is the RAID application The code has assumed that level 2 code would have set up the MPC 107 and cleared the RAM if it is cold boot There are two copies of level 3 code in ROM one in 0xFF800000 0xFFB7FFFF the other in 0xFFB800
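The size limits stated above for the level 2 and level 3 boot code reduce to simple arithmetic, sketched here in Python. The constants come from the text (384 Kbytes, 0x500000, 0x20000); the function names are hypothetical and the check is an illustration, not the array's actual loader logic.

```python
LEVEL2_MAX_BYTES = 384 * 1024   # level 2 code size limit from the text
LEVEL2_RAM_START = 0x500000     # level 2 code occupies RAM from here
LEVEL3_RAM_START = 0x20000      # current fixed start of level 3 code


def level3_max_bytes() -> int:
    """Largest level 3 image that stays below the level 2 RAM region."""
    return LEVEL2_RAM_START - LEVEL3_RAM_START


def level3_fits(size: int) -> bool:
    """True if an image of 'size' bytes loaded at 0x20000 ends at or below 0x500000."""
    return 0 < size <= level3_max_bytes()


print(hex(level3_max_bytes()), level3_max_bytes())
```

With the current start address, the level 3 code space works out to 0x4E0000 bytes (4992 Kbytes). Moving the level 3 start address upward would shrink that space by the same amount.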
63. 00 0xFFF00000 EPROM and tftp Download File The file to be downloaded into ROM or through tftp must have specific header information with a structure such as listed below typedef struct ep_header_struct init code_size codesize init code_cksum codechecksum init code_start codestart init code_base codebase init code_signature codesignature init code_rev coderevision init code_subrev codesubrevision init code_date codedate init code_time codetime init hdr_counter codecounter init code_flags codeflags init reserved init reserved 3 init hdr_cksum headerchecksum P_HEADER The file content must be the binary image to be loaded into ROM or RAM It cannot be an elf file a hex file or a srecord file 54 Sun StorEdge T3 Array Field Service Manual November 2002 The following explains each field in the header TABLE 4 2 Channel Active LED Descriptions Header Description code_size code_cksum code_start code_base code_signature code_rev code_subrev code_date code_time This is the size of the code without the header information This value must be a multiple of four The real file size should be code_size plus sizeof EP_HEADER The 32 bit checksum value of the code code_cksum sum of all 32 bit words in code OR OxXFFFFFFFF 1 The execution starting location For example after downloa
64. 08 a t moul e dans l horloge temps r el SGS Les batteries ne sont pas des pi ces rempla ables par le client Elles risquent d exploser en cas de mauvais traitement Ne pas jeter la batterie au feu Ne pas la d monter ni tenter de la recharger Bloc batterie Caution Attention l alimentation du produit contient une batterie nickel hydrure m tallique Panasonic mod le HHR200SCP Il existe un risque d explosion si cette batterie est manipul e de fa on erron e ou mal mise en place Ne remplacez cette batterie que par une batterie Sun Microsystems du m me type Ne la d montez pas et n essayez pas de la recharger hors du syst me Ne faites pas br ler la batterie mais mettez la au rebut conform ment aux r glementations locales en vigueur Couvercle Caution Attention il est dangereux de faire fonctionner un produit Sun sans le couvercle en place Si l on n glige cette pr caution on encourt des risques de blessures corporelles et de d g ts mat riels Conformit aux certifications Laser Les produits Sun qui font appel aux technologies lasers sont conformes aux normes de la classe 1 en la mati re Class 1 Laser Product Luokan 1 Laserlaite Klasse 1 Laser Apparat Laser Klasse 1 contr les de r glages ou de performances de proc dures autre que celle sp cifi e dans le pr sent document peut provoquer une exposition des radiations dangereuses Caution Attention
65. 1> reset help
reset - reset system (reentrant, not locked)

TABLE D-1 contains an alphabetical listing of the CLI commands supported by the array. See the Sun StorEdge T3 Array Administrator's Manual for a detailed description of each command's syntax, options, and arguments.

TABLE D-1 Commands Listed in Alphabetical Order

Command   Description
boot      Boot system
disable   Disable certain FRUs
disk      Disk administration
enable    Enable certain FRUs
ep        Program the flash EPROM
fru       Display the FRU information
help      Display reference manual pages
id        Display FRU identification summary
logger    Generate messages to the syslog in the unit, dump the reset log, and display system crash information
lpc       Get interconnect card property
port      Configure the interface port
proc      Displays status of outstanding vol processes
refresh   Start/stop battery refreshing or display its status
reset     Reset system
set       Display or modify the set information
shutdown  Shut down array or partner group
sys       Display or modify the system information
ver       Display software version
vol       Display or modify the volume information

FRU Identifiers

Many commands use a FRU identifier to refer to a particular FRU in an array. This identifier contains a unit constant (u), the unit number (encid), the FRU constant (ctr for controller card, pcu for power and cooling unit, l for interconnect card, d f
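The FRU identifier scheme described above can be illustrated with a small Python parser. This is a sketch only: the pattern is inferred from identifiers that appear in the fru list and fru stat examples elsewhere in this manual (such as u1ctr, u2d9, u1l2, u2pcu1, and u1mpn), and the function name parse_fru_id is hypothetical.

```python
import re

# FRU constants as they appear in the fru list output in this manual.
FRU_KINDS = {
    "ctr": "controller card",
    "pcu": "power and cooling unit",
    "l": "interconnect card",
    "d": "disk drive",
    "mpn": "mid plane",
}

FRU_ID = re.compile(r"^u(\d+)(ctr|pcu|mpn|l|d)(\d*)$")


def parse_fru_id(fru_id: str):
    """Split an identifier such as 'u2pcu1' into (unit, kind, number)."""
    match = FRU_ID.match(fru_id)
    if match is None:
        raise ValueError(f"not a FRU identifier: {fru_id}")
    unit, constant, number = match.groups()
    # Controller cards and mid planes carry no FRU number.
    return int(unit), FRU_KINDS[constant], int(number) if number else None


for example in ("u1ctr", "u2d9", "u1l2", "u2pcu1"):
    print(example, parse_fru_id(example))
```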
66. 16 Off Line Drive Diagnostics and Replacement 117 Chassis Replacement Procedure 123 xx Sun StorEdge T3 Array Field Service Manual November 2002 10 Chassis Backplane Assembly 125 Troubleshooting the Chassis Backplane Assembly 125 Replacing the Chassis Backplane Assembly 126 Hardware Reconfiguration 135 Connecting Single Controller Units to Form a Partner Group 135 Preparing the arrays 136 Cabling a Partner Group 138 Establishing a New IP Address 141 Defining and Mounting Volumes on the Alternate Master 144 Disconnecting a Partner Group to Form Single Controller Units 149 Preparing the Arrays 149 Establishing a New IP Address 151 Establishing a Network Connection 152 Alternate Master Unit to a Single Controller Unit 155 Changing the Port ID on the Array 158 Illustrated Parts Breakdown 159 Sun StorEdge T3 Array 160 Sun StorEdge T3 Array Assemblies 161 Door Assembly 162 Interconnect Card Assembly 163 Power Supply and Cooling Unit 164 Controller Card 165 Drive Assembly 166 Cable and Interconnect Assemblies 167 Sun StorEdge T3 Array System Defaults 169 Boot Defaults 169 System Defaults 170 Contents xxi Volume Defaults 171 Default Directories and Files 172 C Sun StorEdge T3 Array Messages 173 Message Syntax 174 Miscellaneous Abbreviations 175 Interpreting Sun StorEdge T3 Array syslog Messages 176 Reset Log Message Types 191 Boot Messages 193 Task List 201 Internal Sun StorEdge T3 Array AL_PA LI
67. 2 loop card SCI SJ 375 0085 01 5 02 Flash 002054 ulpcul power cooling unit TECTROL CAN 300 1454 01 0000 001787 ulpcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001784 u2pcul power cooling unit TECTROL CAN 300 1454 01 0000 001544 u2pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001545 ulmpn mid plane SCI SJ 370 3990 01 0000 000953 u2mpn mid plane SCI SJ 370 3990 01 0000 000958 Chapter 10 Hardware Reconfiguration 145 lt 2 gt fru stat CTLR STATUS STATE ROLE PARTNER TEMP ulctr ready enabled master u2ctr 31 0 u2ctr ready enabled alt master ulctr 30 5 DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME uldl ready enabled data disk ready ready 30 voll uld2 ready enabled data disk ready ready 31 voll uld3 ready enabled data disk ready ready 30 voll uld4 ready enabled data disk ready ready 29 voll uld5 ready enabled data disk ready ready 29 voll uld6 ready enabled data disk ready ready 29 vol3 uld7 ready enabled data disk ready ready 34 vol3 uld8 ready enabled data disk ready ready 37 vol3 uld9 ready enabled data disk ready ready 32 vol3 u2dl ready enabled data disk ready ready 34 vol2 u2d2 ready enabled data disk ready ready 38 vol2 u2d3 ready enabled data disk ready ready 36 vol2 u2d4 ready enabled data disk ready ready 37 vol2 u2d5 ready enabled data disk ready ready 34 vol2 u2d6 ready enabled data disk ready ready 36 v
68. 2 primary 100 Sun StorEdge T3 Array Field Service Manual November 2002 The loop stat Command The loop stat command returns the current loop configuration with regard to the electrical connections between the loop cards A loop configuration other than the example below might indicate loop problems Note The symbol represents the presence of the ISP2200 chip CODE EXAMPLE 8 3 loop stat Command Normal Ouput lt 4 gt loop stat Loop 1 lt 1 gt lt 2 gt Loop 2 lt 1 2 gt Where m lt 1 gt means uld1 9 and ulctr ISP2200 are on the loop m lt 2 gt means u2d1 9 are u2ctr ISP2200 are on the loop m lt 1 gt lt 2 gt means the loop is split into 2 segments m lt 1 2 gt means uld1 9 and u2d1 9 and ulctr and u2ctr ISP2200s are all on the loop m lt 1 2 gt means uld1 9 and u2d1 9 and ulctr ISP2200 are on the loop A disabled u2ctr would result in this configuration m lt 12 gt means uld1 9 and u2d1 9 and u2ctr ISP2x00 are on the loop A disabled ulctr could result in this configuration The disk pathstat Command The disk pathstat command returns the current disk path logical configuration A path status other than what is displayed below might indicate loop problems Chapter 8 Diagnosing and Correcting FC AL Loop Problems 101 Note The Telnet session always runs the command through the master controller CODE EXAMPLE 8 4 disk pathstat Comma
69. 20 TID 9684 OP 4D Invalid command opcode

where:
- May 18 16:36:08 = date and time
- FCC0 = the task that generates the message
- 1 (u1ctr) = the controller reporting the error
- N = message level
- u1ctr = FRU identifier
- ITL 7D 1 0 TT 20 TID 9684 OP 4D Invalid command opcode = message text

The first thing to look at is the task. There is a list of tasks in the FAQ on the HES website and at the end of this document. The most important information for a quick look at data path problems or LUN access problems is the task. If you see FCC0, you know immediately this is a host port issue, and you probably have a front-end loop problem.

FCC2 is the cache mirroring task. These messages represent chatter between the controllers to monitor the status of each other's cache mirror. The FCC2 messages can be misleading, since the cache mirror is actually seen as a LUN, which means you get messages just like the one above. But the LUN being queried is a virtual LUN. These are typically seen right after a boot or when the cache is being flushed (see explanation below).

There are four levels of messages, listed in TABLE C-1 in order of severity: (E)rror, (W)arn, (N)otice, and (I)nformation. Be careful to observe all (E)rrors. These are critical events, like FRU failures. (W)arnings are important as well and could indicate a future problem. (N)otices are frequent and voluminous. Many are just
70.        26    6B  45   69    25  70  112
B9  1B  27    6A  46   70    23  71  113
B6  1C  28    69  47   71    1F  72  114
B5  1D  29    67  48   72    1E  73  115
B4  1E  30    66  49   73    1D  74  116
B3  1F  31    65  4A   74    1B  75  117
B2  20  32    63  4B   75    18  76  118
B1  21  33    5C  4C   76    17  77  119
AE  22  34    5A  4D   77    10  78  120
AD  23  35    59  4E   78    0F  79  121
AC  24  36    56  4F   79    08  7A  122
AB  25  37    55  50   80    04  7B  123
AA  26  38    54  51   81    02  7C  124
A9  27  39    53  52   82    01  7D  125
A7  28  40    52  53   83    00  7E  126
A6  29  41    51  54   84    --  7F  127
A5  2A  42    4E  55   85

Note: The values are intentionally ordered from lowest to highest priority. AL_PA 00 is reserved for the FL_Port. An entry shown as -- is not available.

Source: ftp://ftp.t11.org/t11/member/fc/al/fcal44p.asc

Calculating Port and Loop ids

port_local = 3 x (encl_id - 1) + port_loop

Use this to get the isp port on a ctrl, where:
- encl_id = 1, 2, ... 8
- port_loop = 0, 1, 2

loop_id = encl_id - 1

This is the isp_id -> alpa on each of the 2 back-end isp's on a ctrl (see chart at end of file).

Sense Key Explanations

Sense keys are returned from devices when issued a REQUEST SENSE command. They return more detailed information on a problem which occurred with a previous command. Here are the definitions of Sense keys as defined in the SCSI-2 proposed standard.

0xB ABORTED COMMAND: This indicates that the target aborted the command. The initiator may be able to recover by trying the command again.

0x8 BLANK CHECK: This indicates t
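The port and loop id calculation above can be worked through in Python. This sketch assumes the formula reads port_local = 3 x (encl_id - 1) + port_loop and loop_id = encl_id - 1; the operators were lost in the flattened text, so that reading is a reconstruction. It agrees with a syslog example elsewhere in this manual, where u2ctr task FCC2 reports a port event on port 5. The function names are hypothetical.

```python
def port_local(encl_id: int, port_loop: int) -> int:
    """ISP port number on a controller (reconstructed formula)."""
    if not 1 <= encl_id <= 8:
        raise ValueError("encl_id must be 1..8")
    if port_loop not in (0, 1, 2):
        raise ValueError("port_loop must be 0, 1, or 2")
    return 3 * (encl_id - 1) + port_loop


def loop_id(encl_id: int) -> int:
    """Loop id (isp_id) used by the back-end ISPs on a controller."""
    if not 1 <= encl_id <= 8:
        raise ValueError("encl_id must be 1..8")
    return encl_id - 1


# Unit 2, loop 2: 3 x (2 - 1) + 2
print(port_local(2, 2))
print(loop_id(2))
```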
71. 8f640 00010400 000004c8 19 31 53 pshc 1 N CR 44000000 XER 00000000 LR 0008f650 CTR 00000000 DSISR 00000000 19 31 53 pshc 1 N DAR 00000000 MSR 00001030 IP SRRO deaddeac SRR1 4000b930 DATA LENGTH INCORRECT from bug id 4355112 04 58 17 FCC2 2 N u2ctr Port event received on port 5 abort 0 id 0 04 58 17 FCC2 2 N u2ctr ITL 0 1 0 TT 20 TID AB84 OP 2A Target in Unit Attention 04 58 17 FCC2 2 N u2ctr ITL 0 1 0 TT 20 TID AB90 OP 2A Aborted by Host 04 58 17 SX01 2 N u2ctr ITL 0 1 0 TT 20 TID AB90 OP 2A Data length incorrect 04 58 17 FCC2 2 N u2ctr lt lt Abort Task Set gt gt on port 5 abort 1 All Sun StorEdge T3 arrays LUNs have a Power On Unit Attention pending on each port for each initiator Therefore the back end cache mirroring LUN will receive this error condition for the first I O Since the SVD disk driver causes a force flush by issuing an Abort Task Set upon receiving a Unit Attention condition all outstanding cache mirroring LUN commands at the time of the Unit Attention condition is received will be aborted In addition potential Notice syslog messages may be generated due to the command prematurely getting aborted for example if the data length is incorrect Note Once this initial Unit Attention condition is cleared any subsequent Unit Attention conditions causing Abort Task Set to be generated during normal operation may be due to faulty hardware and is not deemed to be
72. 9 disk drive SEAGATE ST336704FSUN A726 3CD1HKOP ulll loop card SCI SJ 375 0085 01 5 02 Flash 1413 ull2 loop card SCI SJ 375 0085 01 5 02 Flash 2294 u2l1 loop card SCI SJ 375 0085 01 5 02 Flash 001415 u212 loop card SCI SJ 375 0085 01 5 02 Flash 002054 ulpcul power cooling unit TECTROL CAN 300 1454 01 0000 001787 ulpcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001784 u2pcul power cooling unit TECTROL CAN 300 1454 01 0000 001544 u2pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001545 ulmpn mid plane SCI SJ 370 3990 01 0000 000953 u2mpn mid plane SCI SJ 370 3990 01 0000 000958 In this example a EPROM firmware version is listed as Controller card Revision 020100 020101 m Disk drive firmware version is listed as Revision A726 a Interconnect card loop card firmware version is listed as Revision 5 02 Flash c Upgrade firmware if necessary Chapter 10 Hardware Reconfiguration 137 m If the firmware levels are the same on each unit then proceed to Step 6 m If the firmware levels for any of the four types of firmware are different between the master and alternate master upgrade the firmware that does not match on both units Refer to the upgrading firmware instructions in the Sun StorEdge T3 Array Installation and Configuration Manual 6 On both units use the set z command to return critical array settings to factory defaults When prompted to respond answer y yes For example lt 3 gt set z
73. 990 F300 1454 F375 0085 1 2 3 4 F501 5710 5 6 F540 4287 7 F540 4367 Description Door assembly Empty chassis backplane assembly Power supply and cooling unit T3 controller card Interconnect card assembly Drive assembly 18 GB not shown in this view Drive assembly 36 GB not shown in this view Appendix A Illustrated Parts Breakdown 161 Door Assembly Sy f RAY EY B ANN _ ss ASY fa SO FIGURE A 3 Door Assembly TABLE A 2 Door Assembly Item Part Number Description 1 F540 4306 Door assembly 2 F370 3990 Empty chassis backplane assembly 3 F540 4287 Drive assembly 18 GB 4 F540 4367 Drive assembly 36 GB 162 Sun StorEdge T3 Array Field Service Manual November 2002 Interconnect Card Assembly FIGURE A 4 Interconnect Card Assembly TABLE A 3 Interconnect Card Assembly Item Part Number Description 1 F375 0085 Interconnect card assembly 2 F370 3990 Empty chassis backplane assembly 3 F300 1454 Power supply and cooling unit 4 F501 5710 T3 controller card Appendix A Illustrated Parts Breakdown 163 Power Supply and Cooling Unit FIGURE A 5 Power Supply TABLE A 4 Power Supply Item Part Number Description 1 F300 1454 Power supply and cooling unit 2 F375 0085 Interconnect card assembly 3 F370 3990 Empty chassis backplane assembly 4 F501 5710 T3 controller card 5 F370 3956 Battery pack NIMH 164 Sun StorEdge T3 Array Field
74. AC address Set by firmware

System Defaults
Specify system defaults with the sys command. See the Sun StorEdge T3 Array Administrator's Manual for more information on using the sys command.
TABLE B-2 System Default Settings
Sys Parameter   Default   Variables
blocksize       64k       16k, 32k, 64k
cache           auto      auto, writebehind, writethrough, off
mirror          auto      auto, off
mp_support      none      rw, none (multi-pathing support)
rd_ahead        on        on, off (set to off to always perform datablock read ahead)
recon_rate      medium    high, medium, low (reconstruction rate)
memsize         32        Set by controller, read only. In MBytes.
cache memsize   256       Set by controller, read only. In MBytes.

Volume Defaults
Specify volume defaults with the vol command. See the Sun StorEdge T3 Array Administrator's Manual for more information.
TABLE B-3 Volume Defaults
Parameter       Default   Variables
init rate n     16        1 to 16 (1 is lowest, 16 is highest)
verify rate n   1         1 to 8. Rate parameter refers to host interleave factor (contention with host I/Os). Default is 1. There is currently no feature that spawns a vol verify process.
The default for the SCSI vendor ID field is Sun. Display or change this value with the port command. The default Sun StorEdge T3 array volume configuration as shipped from the fact
75. AID controller card can be removed without denying access to all data, assuming appropriate multipathing software has been configured on the host. While data accessibility is maintained during the replacement and testing of a single RAID controller, performance is reduced during this procedure. The customer might elect to schedule the repair action during a time of reduced operations to the Sun StorEdge T3 array system.
For the above example of a suspected loop 1 path 0 problem, perform the following steps:
1. From the CLI, disable the u1 RAID controller card:
<1> disable u1
This causes a controller failover to the other controller. The Telnet session fails and the alternate controller becomes the master. VERITAS, if used, redirects the host I/O through the remaining path for the failed controller's volumes.
2. When the u1 LED flashes amber, remove and replace the u1 controller card. See Removing and Replacing a Controller Card on page 49.
3. After the controller boots, verify that the LED on the u1 interconnect card is solid green.
4. Restart a Telnet session to the array.
5. It may be necessary to disable and then enable the controller with the CLI commands to return it to service. For example:
<2> disable u1
<3> enable u1
6. Verify that VERITAS, if used, completes a path fail back to the replaced controller. Consult your VERITAS documentation for VERITAS diagnostic procedures.
7. V
76. D LOOP Map
SCSI Virtual Disk Driver SVD Error Definitions
Stripe Type Messages
SCSI Command Set
Arbitrated Loop Physical Addresses AL_PA and Loop IDs
Sense Key Explanations
D Sun StorEdge T3 Array System Commands
  Commands List
  FRU Identifiers
E FC-AL Loop Identifiers
F Sun StorEdge T3 Array Configuration Worksheets
  Worksheets
  System Information Worksheets

Figures
FIGURE 2-1 Serial Port Location
FIGURE 3-1 Data Connection Troubleshooting Flow Chart
FIGURE 3-2 Ethernet Troubleshooting Flow Chart
FIGURE 3-3 Procedure A
FIGURE 3-4 MAC Address on the Pull-Out Tab
FIGURE 3-5 Power Switch Locations
FIGURE 3-6 Single Host With Two Controller Units Configured as a Partner Group
FIGURE 4-1 Sun StorEdge T3 Array Controller Card LEDs
FIGURE 4-2 Removing the Controller Card
FIGURE 5-1 Disk Drive LEDs Viewed Through Front Cover
FIGURE 5-2 Removing the Front Panel
FIGURE 5-3 Disk Drive Numbering
FIGURE 5-4 Releasing the Latch Handle
FIGURE 5-5 Removing a Disk Drive
FIGURE 6-1 Interconnect Card LEDs
FIGURE 6-2 Removing the Interconnect Card
FIGURE 7-1 Power Cords Connected to the Power and Cooling Units
FIGURE 7-2 Power and Cooling Unit LEDs
FIGURE 7-3 Removing the Power and Cooling Unit
FIGURE 7-4 Turning the PCU upsid
77. IP address before powering on

Establishing a New IP Address
The JumpStart feature automatically downloads a newly assigned IP address to the array. To enable this feature, you must edit your host file on a RARP server before powering on the array. After you power on, the IP address is automatically assigned.
Before you begin, make sure you have the following:
- MAC address. The MAC address is located in the pull-out tab at the front of the array (FIGURE 10-3).
- IP address. For this information, contact the person who maintains your network.
- Array name. This is the user-assigned name of the array.
FIGURE 10-3 Location of Pull-Out Tab With MAC Address
To set the network IP address for the array:
1. On a host connected to the same subnet as the array, edit the /etc/ethers file by adding the MAC address and array name. For example:
8:0:20:7d:93:7e array-name
In this example:
- 8:0:20:7d:93:7e is the MAC address.
- array-name is the name of the array you are installing.
2. Edit the /etc/hosts file with the IP address and array name. For example:
192.129.122.111 array-name
In this example, 192.129.122.111 is the assigned IP address.
3. Edit the /etc/nsswitch.conf file to reference the local system files. To ensure that the Solaris software environment uses the changes made to et
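The two host-file entries from the steps above can be produced with a small formatter. This is a minimal Python sketch, not a tool from the manual: the function names `ethers_entry` and `hosts_entry` are hypothetical, colon-separated MAC notation is assumed, and the code only builds the text lines, leaving the actual edits of the files on the RARP server to the administrator.

```python
import ipaddress


def ethers_entry(mac: str, name: str) -> str:
    """Build one ethers-file line: MAC address followed by the array name."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("MAC address must have six colon-separated octets")
    for octet in octets:
        # int() raises ValueError when an octet is empty or not hexadecimal.
        if not 0 <= int(octet, 16) <= 0xFF:
            raise ValueError(f"octet out of range: {octet}")
    return f"{mac} {name}"


def hosts_entry(ip: str, name: str) -> str:
    """Build one hosts-file line: IPv4 address followed by the array name."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return f"{ip} {name}"
```

Validating before formatting is a deliberate choice here: a mistyped MAC or IP address would otherwise only show up later, when the array fails to obtain its address at power-on.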
78. Klasse 1 Laser Apparat, Laser Klasse 1 (Class 1 laser product)
Caution: Use of controls, adjustments, or the performance of procedures other than those specified herein may result in hazardous radiation exposure.

Compliance With Safety Agency Regulations
This page describes the safety guidelines to observe when installing Sun products.

Safety Precautions
For your own protection, take the following safety precautions when installing your equipment:
- Observe all warnings and instructions marked on the equipment.
- Make sure that the voltage and frequency of your power source match the voltage and frequency given on the equipment's electrical rating label.
- Never push objects of any kind into openings in the equipment. Because dangerous voltages may be present, conductive objects could cause a short circuit that results in fire, electric shock, or damage to the equipment.

Symbols
The symbols in this manual have the following meanings:
Caution: Risk of personal injury and equipment damage. Follow the instructions.
Caution: High temperature. Do not touch; hot surfaces can cause injury.
Voltages. Follow the instructions to
Caution: Hazardous
79. N A726 3CD1HGKR
u2d4 disk drive SEAGATE ST336704FSUN A726 3CD1HLBJ
u2d5 disk drive SEAGATE ST336704FSUN A726 3CD1HNHO
u2d6 disk drive SEAGATE ST336704FSUN A726 3CD1HHAZ
u2d7 disk drive SEAGATE ST336704FSUN A726 3CD1H92W
u2d8 disk drive SEAGATE ST336704FSUN A726 3CD1HN9T
u2d9 disk drive SEAGATE ST336704FSUN A726 3CD1HKOP
u1l1 loop card SCI SJ 375 0085 01 5 02 Flash 1413
u1l2 loop card SCI SJ 375 0085 01 5 02 Flash 2294
u2l1 loop card SCI SJ 375 0085 01 5 02 Flash 001415
u2l2 loop card SCI SJ 375 0085 01 5 02 Flash 002054
u1pcu1 power cooling unit TECTROL CAN 300 1454 01 0000 001787
u1pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001784
u2pcu1 power cooling unit TECTROL CAN 300 1454 01 0000 001544
u2pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001545
u1mpn mid plane SCI SJ 370 3990 01 0000 000953
u2mpn mid plane SCI SJ 370 3990 01 0000 000958
In this example:
- The EPROM firmware version is Controller card, Revision 020100 020101.
- The disk drive firmware version is Revision A726.
- The interconnect card (loop card) firmware version is Revision 5 02 Flash.

Checking FRU Status
Use the fru stat command to provide a status of each FRU, including temperatures.
<43> fru stat
CTLR STATUS STATE ROLE PARTNER TEMP
u1ctr ready enabled master uze
80. SI Disk Error Occurred path 0x1 port
u2ctr Disabled
u2ctr Not present
u2ctr Missing system shutting down in 30
u2ctr Not present
u2d1 SCSI Disk Error Occurred path 0x1
Sense Data Description Logical Unit Not ed
u2d1 SCSI Disk Error Occurred path 0x1
Sense Key 0x2 Asc 0x4 Ascq 0x2
u2d1 Failed
u2d1 hard err in vol vol001 starting auto
u2d1 Recon attempt failed
SCSI Disk Error Occurred path 0x1 port
Sense Data Description SCSI Parity Error
Sense Data Description SCSI Parity Error
Chapter 8 Diagnosing and Correcting FC-AL Loop Problems 109

Using CLI Diagnostic Commands
Once the syslog file has been examined for warning or error messages and a conclusion has been reached about which loop might have failed, other CLI commands can be used to verify or support that conclusion. These commands display the status and current configuration of the loops.
Use a serial cable and a Tip session to collect and analyze the loop status information of both controllers. The serial cable is necessary to see the loop configuration for the alternate controller, because the Telnet session displays only the current loop status as seen from the master controller.
For the example problem above, the CLI commands produce these results:
- The fru stat command would show a normal status for this failure.
- The vol mode command would show a normal status for this failure.
- The port listmap command would show a normal
81. Sun StorEdge T3 Array 7 Establishing a Serial Port Connection 7 Establishing a Telnet Session 9 Establishing an FTP Session 12 Using t ftpboot to Boot a Single Array or a Partner Group Remotely 13 Configuring a Server for Remote Booting 16 Setting Up Remote Logging 17 Diagnosing T3 Array Problems 19 Diagnostic Information Sources 19 xvii Troubleshooting Flow Charts 22 Initial Troubleshooting Guidelines 25 Troubleshooting Sources 25 Troubleshooting Checks 25 Verifying the Data Host Connection 26 Storage Automated Diagnostic Environment Link Test 27 Checking Array Boot Status 27 Telnet Connection Status Checks 30 Determining Failover 30 Verifying the Firmware Level and Configuration 32 Checking FRU Status 35 Testing the Array With Storage Automated Diagnostic Environment 36 Identifying Miscabled Partner Groups 36 Identifying Data Channel Failures 39 Reserved System Area Recovery Procedure 40 Recovery Procedure 40 Controller Card Assembly 47 Controller Card LEDs 47 Removing and Replacing a Controller Card 49 Upgrading Controller Firmware 51 Controller EPROM Firmware 51 Firmware Upgrade Discussion 52 Boot Code Explanation 52 Level 1 Controller Firmware 56 Level 2 Controller Firmware 57 Level 3 Controller Firmware 58 Sun StorEdge T3 Array Field Service Manual November 2002 Disks and Drives 59 Monitoring Drive Status 59 Checking Drive Status Codes 60 Checking the Hot Spare 61 Checking Data Parity 62 Checking Dr
82. Switch Setting Switch Setting hex dec hex hex dec hex dec

APPENDIX F
Sun StorEdge T3 Array Configuration Worksheets
This chapter contains a blank worksheet for the qualified service provider to make notes at each customer site, and contains the following sections:
- Worksheets on page 221
- System Information Worksheets on page 222

Worksheets
The following information is required to successfully troubleshoot a Sun StorEdge T3 array. Use this worksheet to access the data, Ethernet, and TFTP connections from the application, management, and TFTP host system(s). The application, management, and TFTP host can all be resident on the same server. Supervisor access is required for all hosts during troubleshooting.
Host types are defined as the following:
- Application host: The application host utilizes the FC-AL connection as a data path to and from the Sun StorEdge T3 array.
- Management host: The management host administers configuration and health monitoring of the Sun StorEdge T3 array through a network connection.
- TFTP host: The TFTP host is used to download bootcode to the Sun StorEdge T3 array through a network connection.

System Information Worksheets
The following information should be documented before troubleshooting any Sun StorEdge T3 array. Make copies of this blank form and complete it fo
83. This chapter describes how to monitor and replace the controller card and how to upgrade the firmware. The chapter contains the following sections:
- Controller Card LEDs on page 47
- Removing and Replacing a Controller Card on page 49
- Upgrading Controller Firmware on page 51

Controller Card LEDs
This section describes the controller card LEDs for the Sun StorEdge T3 array.

Sun StorEdge T3 Array Controller Card LEDs
The Sun StorEdge T3 array controller card has two channel-active LEDs, one for the FC-AL interface port and one for the Ethernet port, and a controller status (online) LED. TABLE 4-1 lists the possible states of the controller card LEDs and describes each state.
FIGURE 4-1 Sun StorEdge T3 Array Controller Card LEDs (callouts: FC-AL active LED, Ethernet active LED, 100BASE-T active LED, Controller online status LED)
TABLE 4-1 Sun StorEdge T3 Array Controller Card LED Descriptions
LED                               Action          Description
FC-AL Channel Active LED (green)  Off             Port disabled
                                  Green           Port enabled and idle
                                  Blinking green  Port enabled and active
Ethernet Active LED (green)       Off             Link invalid
                                  Green           Link valid and idle
                                  Blinking green  Link valid and active
100BASE-T Active LED (green)      Off             Port disabled (10 Mbps rate)
                                  Green           Port enabled and idle (100 Mbps rate)
                                  Blinking green  Port enabled and active
Controller Status LED             Off             Controller not installed not gr
84. U has completed on page 70
The firmware upgrade procedures must be done through the Ethernet connection. The latest firmware versions are located on the SunSolve web site: http://sunsolve.sun.com
The current firmware file naming restrictions are as follows:
- The name consists of a string of 1 to 12 characters.
- The name must start with an alphabetic character and not a numeral. For example, file1.bin is acceptable; 1file.bin is not acceptable.
- The characters can be a combination of the following: alphabetic letters; digits 0 through 9; special characters such as _ (underscore), . (period), $ (dollar symbol), and - (dash).
- Names are case sensitive. For example, ABC and abc are different files.
Make sure the latest firmware versions are installed and that the array configuration information indicates that the unit is ready for operation. Check the firmware versions and array information in a telnet session with the array.
1. On the host, use the telnet command with the array name or IP address to connect to the array. For example:
telnet array-name
Trying 123.123.123.3...
Connected to 123.123.123.3.
Escape character is '^]'.
Telnet session (123.123.123.3)
2. Log in to the array by typing root and the supervisor password at the prompts. The array prompt is displayed.
3. Enter ver to identify the controlle
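The naming restrictions listed above translate naturally into a single pattern check. The sketch below is an illustration only: the function name `is_valid_firmware_name` is hypothetical, and it assumes that the four special characters named in the list (underscore, period, dollar symbol, dash) are the complete set, which the phrase "such as" in the manual does not guarantee.

```python
import re

# 1 to 12 characters in total: one leading letter, then up to 11 characters
# drawn from letters, digits, underscore, period, dollar symbol, and dash.
_FIRMWARE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.$-]{0,11}")


def is_valid_firmware_name(name: str) -> bool:
    """Return True when `name` satisfies the listed naming restrictions."""
    return _FIRMWARE_NAME.fullmatch(name) is not None
```

Checking a file name on the host before transferring it avoids a failed upgrade step on the array. Because names are case sensitive, the check deliberately does not fold case: `ABC` and `abc` are both valid, but they name different files.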
85. U remains enabled
- Battery life span failure; PCU remains enabled
Note: Verify a power and cooling unit failure using the CLI or Component Manager.
Note: Even if the LED indicates a power and cooling unit failure, always verify the FRU status using the CLI before replacing the power and cooling unit. Refer to Checking FRU Status on page 35 for instructions.

Removing and Replacing a Power and Cooling Unit
Caution: To ensure correct airflow for system cooling, both power and cooling units must be in the installed position for normal operation. A failed power and cooling unit should be removed only when a replacement power and cooling unit is available to be inserted.
Caution: Replace only one power and cooling unit at a time to prevent system interruption.
To replace a power and cooling unit:
Observe static electricity precautions. See Static Electricity Precautions on page 5.
Power off the power and cooling unit by pressing the power switch (FIGURE 7-1). Make sure that the AC LED is amber and the PS LED is off (FIGURE 7-2).
Disconnect the power cord from the AC outlet.
Disconnect the power cord from the power and cooling unit connector by squeezing both sides of the connector and pulling straight out (FIGURE 7-1).
Unlock the power and cooling unit by using a coin or small screwdriver to push in and release the two lat
86. X XXX XXX XXX
Where xxx.xxx.xxx.xxx is the host IP address.
Reset the T3 system with the reset y command:
T3B-EP> reset y
Press a key from a serial port connection when the system prompts to press a key within three seconds.
Install the firmware using the ep netloadl command:
T3B-EP> ep netloadl level 1 image filenam
Set the bootmode to automatic:
T3B-EP> set bootmode auto
Power cycle the array to reset it.
a. Type:
<4> shutdown
shutdown the system are you sure N y
b. Press the power button on each power and cooling unit to remove AC power.
c. Press the power buttons again to return AC power to the array.

Level 2 Controller Firmware
In an enterprise configuration, the ep command downloads level 2 firmware to both the master unit and alternate master unit at one time. To upgrade the Level 2 controller firmware, perform the following steps:
Chapter 4 Controller Card Assembly 57
1. Use the ftp binary mode to transfer the firmware to the storage system's directory. See Establishing an FTP Session on page 12.
2. In a telnet session with the array, install the level 2 image. Type:
<1> ep download level 2_image_filename

Level 3 Controller Firmware
In an enterprise configuration, this procedure downloads level 2 firmware to both the master unit and alternate master unit at one time. To upgrade the Level 3
87. able unit FRU identifier to refer to a particular FRU in a Sun StorEdge T3 array (TABLE 1-2). This identifier contains a unit constant (u), the unit number (n), a FRU constant (ctr for controller card, pcu for power and cooling unit, l for unit interconnect card, d for disk drive), and the FRU number (n).
TABLE 1-2 FRU Identifiers
FRU                     Identifier  Unit number
Controller card         unctr       n = unit number (1, 2)
Power and cooling unit  unpcun      n = unit number (1, 2); n = pcu number (1, 2)
Unit interconnect card  unln        n = unit number (1, 2); n = interconnect number (1, 2)
Disk drive              undn        n = unit number (1, 2); n = disk drive number (1 to 9)
Chapter 1 Troubleshooting Overview 3

Sun Storage Automated Diagnostic Environment
The Storage Automated Diagnostic Environment is a host-based online health and diagnostic monitoring tool for storage area network (SAN) and direct-attached storage (DAS) devices. It can be configured to monitor on a 24-hour basis, collecting information that enhances the reliability, availability, and serviceability (RAS) of the storage devices.
The Storage Automated Diagnostic Environment offers the following features:
- Common web-based user interface for device monitoring and diagnostics
- Distributed test invocation by means of lists or topology
- Topology grouping for multi-level hosts and components
- Alternate master
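The identifier scheme in the FRU Identifiers table can be checked mechanically. The following Python sketch is an illustration under stated assumptions: the names `FruId` and `parse_fru_id` are hypothetical, and only the four FRU types and the number ranges given in the table are covered (other identifiers that appear in CLI output, such as the midplane, are rejected).

```python
import re
from typing import NamedTuple, Optional


class FruId(NamedTuple):
    unit: int               # unit number, 1 or 2
    kind: str               # "ctr", "pcu", "l", or "d"
    number: Optional[int]   # FRU number; None for the controller card


# u<unit> followed by ctr, pcu<1-2>, l<1-2>, or d<1-9>, per the table.
_FRU = re.compile(r"u([12])(?:(ctr)|(pcu|l)([12])|(d)([1-9]))")


def parse_fru_id(text: str) -> FruId:
    """Split a FRU identifier such as 'u2pcu1' into its components."""
    match = _FRU.fullmatch(text)
    if match is None:
        raise ValueError(f"not a FRU identifier from the table: {text!r}")
    unit = int(match.group(1))
    if match.group(2):
        return FruId(unit, "ctr", None)
    if match.group(3):
        return FruId(unit, match.group(3), int(match.group(4)))
    return FruId(unit, "d", int(match.group(6)))
```

A strict parser like this is also a quick way to catch transcription slips, for example the letter l typed in place of the digit 1 in a unit number.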
88. ady 30 vol1
u1d4 ready enabled data disk ready ready 29 vol1
u1d5 ready enabled data disk ready ready 28 vol1
u1d6 ready enabled data disk ready ready 29 vol3
u1d7 ready enabled data disk ready ready 34 vol3
u1d8 ready enabled data disk ready ready 37 vol3
u1d9 ready enabled data disk ready ready cun vol3
u2d1 ready enabled data disk ready ready 34 vol2
u2d2 ready enabled data disk ready ready 38 vol2
u2d3 ready enabled data disk ready ready 36 vol2
u2d4 ready enabled data disk ready ready 37 vol2
u2d5 ready enabled data disk ready ready 34 vol2
u2d6 ready enabled data disk ready ready 35 vol4
u2d7 ready enabled data disk ready ready 35 vol4
u2d8 ready enabled data disk ready ready 40 vol4
u2d9 ready enabled data disk ready ready 36 vol4
LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP
u2l1 ready enabled master installed 2285
u2l2 ready enabled slave installed 316 30
u1l1 ready enabled master installed 29 5
u1l2 ready enabled slave 5 installed 30 5
POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2
u1pcu1 ready enabled line normal normal normal normal normal
u1pcu2 ready enabled line normal normal normal normal normal
u2pcu1 ready enabled line normal normal normal normal normal
u2pcu2 ready enabled line normal normal normal normal normal

Upgrading Disk Drive Firmware
The latest disk drive firmware versions are located on the SunSolve web site http sunsolve s
89. ages See the Sun StorEdge T3 Array Administrator's Manual for explanations of the more important error messages. This chapter contains the following sections:
- Message Syntax
  - Message Types
  - FRU Identifiers
  - Miscellaneous Abbreviations
- Interpreting Sun StorEdge T3 Array syslog Messages
  - The Basic Message Format
  - Interpreting ITL Messages in an FCAL Environment
  - Interpreting ITL Messages in a Fabric SAN Environment
  - Identifying Sun StorEdge T3 Array Ports and Loops
  - SVD SVC Error Messages
  - Disk Related Error Messages
  - Common Host Port FCC0 Messages
  - Assertion and Exception Reset Messages
- Reset Log Message Types
  - Reset Log Messages
- Boot Messages
  - Interpreting Boot Messages
- Task List
- Internal Sun StorEdge T3 Array AL_PA LID LOOP Map
- SCSI Virtual Disk Driver SVD Error Definitions
- Stripe Type Messages
- SCSI Command Set
- Arbitrated Loop Physical Addresses AL_PA and Loop IDs
- Sense Key Explanations

Message Syntax
Error message syntax consists of the following two components:
- Message Ty
90. aligned
Problems with the backplane most likely occur because of an electrical short or a bent or broken pin connector. These problems first appear as a failure of another FRU component, such as an interconnect failure or drive failure. If replacing the FRU that appears to have failed does not correct the problem, examine the backplane connector that the FRU connects to for bent or broken pins. If nothing is obvious, install another spare FRU to verify that a failed FRU component is not causing the problem. If all possibility of a FRU component failure has been eliminated and the problem still remains, it is likely to be a backplane failure.

Replacing the Chassis Backplane Assembly
If there is a backplane failure, replace it with the following procedure.
Caution: Replacing a Sun StorEdge T3 array chassis interrupts array operation.
Note: If the Sun StorEdge T3 array is part of a partner group, access to all volumes in the partner group is unavailable during this backplane replacement procedure. Assess the impact of unmounting volumes and stopping applications prior to starting this procedure.
1. Perform full backups of data on affected partner groups for all accessible volumes.
2. From the data hosts, quiesce all I/O going to all volume(s) in that disk array and associated partner group.
- Notify all applications to stop accessing any affected volumes by unmounting the volume(s) or stopping the applic
91. as to the type of electrical power supplied to your building, contact your facilities manager or a qualified electrician.
Caution: Not all power cords have the same current ratings. Household extension cords do not have overload protection and are not meant for use with computer systems. Do not use household extension cords with your Sun product.
Caution: Your Sun product is shipped with a grounding type (three-wire) power cord. To reduce the risk of electric shock, always plug the cord into a grounded power outlet.
The following caution applies only to devices with a Standby power switch:
Caution: The power switches of this product function as standby type devices only. The power plugs serve as the primary disconnect device for the product. You must unplug ALL power plugs to remove power from the product. Be sure to install the product close to an easily accessible wall outlet.

Lithium Battery
Caution: On the system control board, a lithium battery, part number MK48T59Y MK48TXXB XX MK48T18 XXXPCZ M48T59W XXXPCZ M4T28 XXXYYSHZ or MK48T
92. Using tftpboot to Boot a Single Array or a Partner Group Remotely
If you have a partner group that cannot boot on its own, you can use tftpboot to boot it remotely.
Note: The tftpboot server must be on the same subnet as the array.
To remotely boot a Sun StorEdge T3 array:
1. Set up the remote server. See Configuring a Server for Remote Booting on page 16.
2. Unplug the Ethernet cable connected to the alternate master. Leave the Ethernet cable on the master connected.
3. Get to the array EPROM as described in Establishing a Serial Port Connection on page 7.
4. Set the array boot mode to tftpboot:
T3-1> set bootmode tftp
T3-1> set
bootmode auto
bootdelay 3
sn 112035
ip 10.4.35.134
netmask 299 5 25 D 2900
gateway 10 4 35
tftphost 123.123.123.6
tftpfile releases nb210 nb210p20 bin
hostname gatest
timezone GMT 00
vendor 0301
model 501 5710 02 51
revision 020100
logto
Aug9 loglevel 3
rarp off
mac 00 20 2 00 03 b9
Chapter 2 Connecting to the Sun StorEdge T3 Array 13
5. Set the tftphost IP address and tftp filename:
T300-EP> set tftphost 123.123.123.6
T300-EP> set tftpfile filename.bin
T300-EP> set
bootmode tftp
bootdelay 3
sn 000596
ip 123.123.123.99
netmask 225 29 29550
gateway 129.153.49.254
tftphost 129.153.49.2
tftpfile nb210 bin
hostname purple31
timezone
vendor 0301
model 501 5710 02 51
revision 0200
logto sy
93. ard at a time. Pulling both interconnect cards at one time could cause a system shutdown. Follow the procedure as described to ensure that there is no interruption in system operation or loss of data.
To prevent interruption of the data host system operation during interconnect card replacement, ensure that:
- In a single controller unit configuration, you remove only the failed interconnect card. Leave the second interconnect card intact in the array.
- In a partner group, you remove the interconnect cable only from the failed interconnect card. Leave the interconnect cable attached to the working interconnect card.
To replace an interconnect card:
1. Ensure that the interconnect card to be replaced is showing failure status. Refer to FIGURE 6-1.
2. Observe static electricity precautions. See Static Electricity Precautions on page 5.
3. Remove the interconnect cable from the failed interconnect card only. Mark the connector with either 1 or 2.
Note: In a single controller unit configuration, ignore this step and proceed to Step 4.
Chapter 6 Interconnect Card Assemblies 77
4. Unlock the failed interconnect card by pushing in on the latch handle. Use a coin or small screwdriver to press in and release the latch handle.
FIGURE 6-2 Removing the Interconnect Card
5. Pull the interconnect card out using the latch handle.
Caution: The interconnect card that is removed must be replaced within 30 minutes
94. arget id 1 is LUN
This example shows a fatal timeout on the cache mirroring LUN for LUN 1 on u2. This is evident from the information in the previous discussion of lids, the ALPA chart, and port listmap. We now know two things:
- portid target 0 is the cache mirror LUN 1 on u2
- LUN 1 is on u2

The Gauntlet
When a controller issues a command to its partner, it starts a watchdog timer for the command. If the command is not complete within the required time frame, the controller times out the command as a fatal error. This indicates that u2 tried to write to its cache mirror and could not, so it timed out the command. This example is from a case where u1 had failed and was eventually replaced.

Disk Related Error Messages
Disk drives have CRC and ECC protection on all sectors, so they can detect whether or not data is read correctly and, in some cases, use the ECC to correct the data.
Many disk errors consist of more than a single syslog entry. Frequently, the event occurring on, with, or to the disk generates other system events, such as a PATH failover or the disabling of the disk after too many errors. The key is to look for clusters of messages. After a certain threshold, the active controller disables the failing drive.

Basic Format of Messages
SCSI Disk Error occurred path 0x0 port 0xc lun 0x0
where:
- SCSI Disk Error occu
95. as been reset
0x9 VENDOR SPECIFIC: This sense key is available for reporting vendor-specific conditions.
0xD VOLUME OVERFLOW: This indicates that a buffered peripheral device has reached the end of partition and data may remain in the buffer that has not been written to the medium. A RECOVER BUFFERED DATA command(s) may be issued to read the unwritten data from the buffer.
Appendix C Sun StorEdge T3 Array Messages 213

APPENDIX D
Sun StorEdge T3 Array System Commands
This appendix lists the commands supported by the Sun StorEdge T3 array and is divided into the following sections:
- Commands List on page 215
- FRU Identifiers on page 217

Commands List
To view the available command-line interface (CLI) commands on the array, type help at the prompt.
<184> help
arp cat cd cmp help ls mkdir mv tail touch
boot disable disk enable more ntp passwd port sync sys
tzset ver refresh route ofdg lun cp ping fru proc
vol hwwn date pwd id reset volslice echo rm logger
set head rmdir lpc shutdown ep
Note: Use the login prompt, rather than the EP prompt, to set the IP address, netmask, and hostname. Settings for these parameters made at the EPROM level will be lost.
To display command syntax, use the command name help command. For example, for information on the reset command, type <
96. ast_test multiple times, one time for each loop.

The ofdg fast_test Option
The fast_test option provides a fast Go/No-Go loop test. The fast_test option performs the following steps:
1. LAC_Reserve the FC-AL loop device under test (DUT).
2. Test next nearest enclosure on Loop DUT.
3. Repeat Step 2 until all enclosures are tested.
4. LAC_Release the FC-AL loop device under test (DUT).
The fast_test option uses only the two worst-case data patterns, as shown below:
#define ONDG_PATTERN_FOUR 0x7E7E7E7E /* from SUN */
#define ONDG_PATTERN_SIX 0x4A4A4A4A /* from SUN */
For each data pattern, the fast_test option performs the following:
- 2 synchronous Write/Read/Compares at 64 KB
- 250 asynchronous Read/Writes at 64 KB
- Monitors for errors using all the FC-AL port counters on the Loop DUT, plus the counters from the single disk DUT

The ofdg fast_find Option
The fast_find option provides a fast Go/No-Go loop test identical to fast_test, plus a simplified Loop Fault Diag. The fast_find option performs the following steps:
1. LAC_Reserve the FC-AL loop device under test (DUT).
2. Reconfigure Loop via MUX with next nearest enclosure on Loop DUT.
3. Test next nearest enclosure on Loop DUT.
4. Repeat Step 2 and Step 3 until all enclosures are tested.
5. LAC_Release the FC-AL loop device under test (DUT).
The big difference
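The write/read/compare idea behind the two data patterns above can be illustrated in a few lines of Python. This is not the ofdg implementation: the function names are hypothetical, an in-memory `bytearray` stands in for the device under test, and the loop reservation, asynchronous I/O, and port counter monitoring steps are left out entirely.

```python
# The two worst-case data patterns quoted in the text.
PATTERNS = (0x7E7E7E7E, 0x4A4A4A4A)
BLOCK_SIZE = 64 * 1024  # 64 KB transfers, as in the fast_test description


def pattern_block(pattern: int, size: int = BLOCK_SIZE) -> bytes:
    """Fill a buffer of `size` bytes (a multiple of 4) with a 32-bit pattern."""
    if size % 4:
        raise ValueError("size must be a multiple of 4")
    return pattern.to_bytes(4, "big") * (size // 4)


def write_read_compare(device: bytearray, pattern: int) -> bool:
    """Write the pattern over the stand-in device, read it back, and compare."""
    expected = pattern_block(pattern, len(device))
    device[:] = expected            # "write"
    return bytes(device) == expected  # "read" and compare


def go_no_go(device: bytearray) -> bool:
    """Pass only if every pattern survives a write/read/compare cycle."""
    return all(write_read_compare(device, pattern) for pattern in PATTERNS)
```

A stand-in buffer always passes, of course; the point of the sketch is the shape of the check, where a single mismatched byte for either pattern turns the result into a No-Go.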
97. at are being added to an existing hub configuration.
To change the port ID on a Sun StorEdge T3 array:
1. Connect to the array in a telnet session. See Establishing a Telnet Session on page 9 for instructions.
2. Use the port command on the array to change the port ID. You must select a new numerical value for the port identifier. For example, to change a port ID on u1p1 from a value of 1 to a value of 20, type:
<1> port set u1p1 targetid 20
3. On the array, type reset for the new port ID to take effect.

APPENDIX A
Illustrated Parts Breakdown
This appendix contains part numbers and illustrations of field-replaceable units (FRUs). The following assemblies are illustrated in this chapter:
- Sun StorEdge T3 Array on page 160
- Sun StorEdge T3 Array Assemblies on page 161
- Door Assembly on page 162
- Interconnect Card Assembly on page 163
- Power Supply and Cooling Unit on page 164
- Controller Card on page 165
- Drive Assembly on page 166
- Cable and Interconnect Assemblies on page 167

Sun StorEdge T3 Array
FIGURE A-1 Sun StorEdge T3 Array Front View

Sun StorEdge T3 Array Assemblies
FIGURE A-2 Sun StorEdge T3 Array Back View
TABLE A-1 Sun StorEdge T3 Array Assemblies
Item Part Number F540 4306 F370 3
98. at you type when contrasted with on screen computer output AaBbCc123 Book titles new words or terms words to be emphasized glossary terms Command line variable replace with a real name or value Edit your login file Use ls a to list all files 2 You have mail su Password Read Chapter 6 in the User s Guide These are called class options The user must be superuser to do this To delete a file type rm filename Shell Prompts TABLE P 2 Shell Prompts Shell C shell C shell superuser Bourne shell and Korn shell Bourne shell and Korn shell superuser xxx Sun StorEdge T3 Array Field Service Manual November 2002 Prompt machine_name machine_name Related Documentation Application Latest array updates Installation overview Safety procedures Site preparation Installation and Configuration Administration Cabinet installation Disk drive specifications Host Bus Adapters Title Sun StorEdge T3 Array Release Notes Sun StorEdge T3 Array Start Here Sun StorEdgeT3 Array Regulatory and Safety Compliance Manual Sun StorEdge T3 Array Site Preparation Guide Sun StorEdge T3 Array Installation and Configuration Manual Sun StorEdge T3 Array Administrator s Manual Sun StorEdge T3 Array Installation and Configuration Manual 18 Gbyte 1 inch 10K rpm Disk Drive Specifications 36 Gbyte 10K rpm 1 Inch Disk Drive Specifications 73 Gbyte 10K rpm
99. atch to the arrays while they were connected in the partner group, the files contained on the array's reserved system area are not upgraded on the alternate master, but only on the master unit. When the units are disconnected, the alternate master unit reverts to the file system stored on its reserved system area.

To correct this situation and ensure that the array is ready for operation:

1. Install the latest firmware patch on the array. This patch is available on the SunSolve web site: http://sunsolve.sun.com
   a. From the SunSolve web site, select Patches under the SunSolve Online column.
   b. Select the Storage Products option from the Patches web page. Refer to the README file on the web page for specific details on installing the patch for the Sun StorEdge T3 array firmware.
2. Create a volume and initialize it.
3. Use the vol list and vol stat commands to verify that the volume(s) is mounted correctly. For example:

    <7> vol list
    capacity      raid   data     standby
    134.890 GB    5      u1d1-5   none

    <8> vol stat
    u1d1   u1d2   u1d3   u1d4   u1d5
    0      0      0      0      0

4. Use the vol init vol1 fast command to preserve the old alternate master's data.

Chapter 10 Hardware Reconfiguration 155

5. Use the fru list and fru stat commands to verify that the array is functional and ready for operation. For example:

    <9> fru list
100. ath_id 1d6 SVD_PATH_FAILOVER path_id ld7 SVD_PATH_FAILOVER path_id 1d8 SVD_PATH_FAILOVER path_id 1d9 SVD_PATH_FAILOVER path_id lctr ISP not ready on loop 1 1d1 Bypassed on loop 1 1d2 Bypassed on loop 1 ld3 Bypassed on loop 1 CO CO CD CO CD 1d4 Bypassed on loop 1 ld5 Bypassed on loop 1 1d6 Bypassed on loop 1 147 Bypassed on loop 1 1d8 Bypassed on loop 1 1d9 Bypassed on loop 1 lctr ISP not ready on loop 1 letr ISP2200 0 Received LIP f8 dl async event 1d1 Not bypassed on loop 1 1d2 Not bypassed on loop 1 1d3 Not bypassed on loop 1 1d4 Not bypassed on loop 1 1d5 Not bypassed on loop 1 ld6 Not bypassed on loop 1 147 Not bypassed on loop 1 1d8 Not bypassed on loop 1 ld9 Not bypassed on loop 1 fcal ports were detected on 11 lctr ISP2200 0 Received LIP f7 ef async event fcal ports were detected on 11 111 ONDG No Loop Trouble Found o 120 Sun StorEdge T3 Array Field Service Manual November 2002 ay ay ay ay ay ay ay ay ay ay ay ay ay 26 26 26 26 26 26 26 26 26 26 26 26 26 LO O AO IO O O O O O OO WO 0 D ND ND ND NN NN NN ND NN ND ND ND UN Oe Sl 0 10 O OO Oo o a 04 205 05 05 05 20 5 05 10 10 10 rO 10 113 CFGT 1 ISR1 ISR1 ISR1 ISR1 ISR1 ISR1 BELP O DG 1 DG 1 DG 1 DG 1 DG 1 ulctr Relea
101. ation if necessary m Verify that all drive activity has stopped The drive activity LEDs become solid green indicating that the drives are idle 3 If any volume manager software is running such as VERITAS disable transactions to the volumes that reside on the Sun StorEdge TT3 array backplane you are replacing and to all other volumes in that partner group m Consult the appropriate volume manager documentation for information on disabling the data hosts access to the Sun StorEdge T3 array volumes 4 Execute the shutdown command lt 1 gt shutdown Shutdown the system are you sure N y 126 Sun StorEdge T3 Array Field Service Manual November 2002 5 Power down the failed disk array Press the power button once on each power and cooling unit to turn the switch off FIGURE 9 1 Power switches Il LlLl oo qo Do FIGURE 9 1 Power Switch Location All arrays power down automatically when any one array in the partner group is powered down 6 Record the Sun StorEdge T3 array system serial number and MAC address Locate the pull out tab at the left side of the array next to the first disk drive as shown in FIGURE 9 2 This tab contains the array serial number and media access control MAC address The serial number is located on the top left portion of the pull out tab and begins with the part number 595 xxxx Record this information to transcribe i
102. relating to fonts is protected by copyright and licensed from Sun suppliers. Parts of this product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd.

Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, JumpStart, Sun StorEdge, Storage Automated Diagnostic Environment, SunSolve, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

The OPEN LOOK and Sun graphical user interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox graphical user interface, which license also covers Sun's licensees who implement the OPEN
103. ay Generated Messages on page 2 for more information about array generated messages Host generated message found in the var adm messages file indicating a problem with the host channel connection to the array unit See Host Generated Message on page 2 for more information about host generated messages Troubleshooting Checks The connection between the host and the Sun StorEdge T3 array as described in Storage Automated Diagnostic Environment Link Test on page 27 The array boot status as described in Checking Array Boot Status on page 27 FRU status as described in the Sun StorEdge T3 Array Administrator s Manual Array status as described in Telnet Connection Status Checks on page 30 Array operation as described in Testing the Array With Storage Automated Diagnostic Environment on page 36 Miscabled partner groups as described in Identifying Miscabled Partner Groups on page 36 Data channel as described in Identifying Data Channel Failures on page 39 Chapter 3 Diagnosing T3 Array Problems 25 26 Verifying the Data Host Connection To verify the physical connection between the host and the array use a utility such as the format command in the Solaris environment The output of the command confirms whether a volume is on the array For example On the application host enter format at the supervisor prompt format Searching for di
104. by the system are displayed. pSOSystem also identifies in the boot messages any controllers not responding, or if the master has failed over to the alternate master.

Firmware status codes are good indicators of internally detected system and configuration problems. In the previous boot message example, a firmware status of 3 is displayed. This status implies the array is ready for operation. TABLE 3-2 lists other firmware status codes that can be reported through the serial port console during an array boot cycle.

TABLE 3-2 Firmware Status Indicators

Status   Definition
0        ISP is waiting for configuration process to complete
1        ISP is waiting for ALPA assignment
2        ISP is waiting for port login
3        ISP is ready and optimal
4        ISP has lost loop synchronization
5        ISP has experienced an unrecoverable error
6        Reserved
7        ISP is not participating on the loop

Once the array has fully booted, all the commands available through the CLI are accessible.

Note: If you make configuration changes at the EPROM prompt, they can be overwritten when the array boots completely. Check the array settings after the array has booted to ensure that they are correct.

A message such as the following might appear after you log in:

    6 1 device not mounted

It is possible that the serial cable is connected to the alternate master unit instead of the master unit. To dete
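As a quick reference, the firmware status table can be expressed as a lookup. This is a minimal sketch, assuming the definitions map to codes 0 through 7 in the order they are listed (the text confirms only that status 3 means ready); the names FIRMWARE_STATUS, describe_status, and is_ready are hypothetical and are not part of the array CLI.

```python
# Status meanings from TABLE 3-2. Codes 2 through 7 are assumed to follow
# the listing order; the text explicitly confirms only that 3 means ready.
FIRMWARE_STATUS = {
    0: "ISP is waiting for configuration process to complete",
    1: "ISP is waiting for ALPA assignment",
    2: "ISP is waiting for port login",
    3: "ISP is ready and optimal",
    4: "ISP has lost loop synchronization",
    5: "ISP has experienced an unrecoverable error",
    6: "Reserved",
    7: "ISP is not participating on the loop",
}
READY_STATUS = 3


def describe_status(code):
    """Return the table definition for a status code, or a fallback string."""
    return FIRMWARE_STATUS.get(code, "unknown status code")


def is_ready(code):
    """True only for the status the text describes as ready for operation."""
    return code == READY_STATUS
```

A service script reading the serial console could pass the reported status to is_ready and print describe_status for anything else.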
105. c/ethers and /etc/hosts files, edit the host and ethers entries in the /etc/nsswitch.conf file so that the files parameter appears before the [NOTFOUND=return] statements:

    hosts:  nis files [NOTFOUND=return]
    ethers: nis files [NOTFOUND=return]

4. Determine if the RARP daemon is running by typing:

    ps -eaf | grep rarpd

   - If the RARP daemon is running, proceed to Step 6.
   - If the RARP daemon is not running, proceed to the next step.

5. Start the RARP daemon in the Solaris software environment by typing:

    /usr/sbin/in.rarpd -a &

6. Power on both arrays by pressing the power button on each power and cooling unit. All power and cooling unit LEDs on both units turn green, indicating that power has been restored.

The IP address automatically downloads to the master controller unit after you power on.

Note: In some cases, the array times out before receiving the RARP request through an Ethernet switch. If this time-out happens, the array cannot receive the assigned IP address. An improper spanning-tree setting of the Ethernet switch might cause this time-out. Refer to your switch vendor documentation for information on spanning-tree settings and how to change them. Changing this setting properly enables the array to receive the RARP request before timing out.

Defining and Mounting Volumes on the Alternate Master

Once the units are cabled and power has been
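The ordering requirement for the nsswitch.conf entries can be checked with a few lines of code. The following is a sketch only, assuming each entry has the form "database: source ... [NOTFOUND=return]"; the function name files_before_notfound is hypothetical.

```python
def files_before_notfound(entry):
    """Return True if 'files' is listed before any [NOTFOUND=...] action.

    `entry` is one nsswitch.conf line, for example
    'hosts: nis files [NOTFOUND=return]'.
    """
    _, _, sources = entry.partition(":")
    tokens = sources.split()
    if "files" not in tokens:
        return False
    files_idx = tokens.index("files")
    for idx, tok in enumerate(tokens):
        if tok.upper().startswith("[NOTFOUND"):
            return files_idx < idx
    # No NOTFOUND action present, so the files source is still consulted.
    return True
```

An entry such as "hosts: nis [NOTFOUND=return] files" fails this check, because a lookup that NIS cannot resolve returns before the local files are read.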
106. cabinet.

12. Connect all cables previously removed, but do not power up the array(s).

Note: If the array is part of a partner group, make sure that the host FC-AL cables are recabled to the same Sun StorEdge T3 array FC-AL connections that they were removed from, as you noted down in Step 7. Also ensure that loop cables are properly recabled.

13. Add the T3 array serial number and the MAC address to the new chassis. Locate the pull-out tab at the left side of the array, next to the first disk drive. Use a fine-tipped permanent marker to write the information on this tab; you also need the information for the next two steps.

FIGURE 9-4 Serial Number and MAC Address on Pull-out Tab

14. Contact the appropriate Contract Administrator (CA) of the Contracts Verification Group (CVG) to relay the system serial number and new chassis information.

15. On the RARP server, update the /etc/ethers file. Replace the MAC address entry of the failed chassis with the MAC address of the new chassis. For example:

    8:0:20:6d:93:7e array-name

In this example:
- 8:0:20:6d:93:7e is the new MAC address
- array-name is the name of the old array

Note that if the failed unit was an alternate master, the unit's MAC address may not be in the /etc/ethers file. In this case, no file changes are required.

16. Verify that th
107. cal system files. To ensure the Solaris software environment uses the changes made to the /etc/ethers and /etc/hosts files, edit the host and ethers entries in the /etc/nsswitch.conf file so that the files parameter appears before the [NOTFOUND=return] statements, as shown:

    hosts:  nis files [NOTFOUND=return]
    ethers: nis files [NOTFOUND=return]

   d. Determine if the RARP daemon is running by typing:

    ps -eaf | grep rarpd

      - If the RARP daemon is running, proceed to Step 3.
      - If the RARP daemon is not running, continue to Step e.

   e. Start the RARP daemon in the Solaris environment by typing:

    /usr/sbin/in.rarpd -a &

3. Ensure that there is an Ethernet connection to the 100BASE-T port of the top unit.

4. Press the power switch on the power and cooling units on both arrays to remove AC power (FIGURE 3-5). It may take some time for the units to power off while shutdown procedures are performed. Wait until the units have powered off completely.

FIGURE 3-5 Power Switch Locations

5. After both units have powered off, press the power switch on the power and cooling units again to restore power to, and reset, the arrays. It may take up to several minutes for the arrays to power on and come back online. All LEDs will be green when the unit is fully powered on.

6. After the units are
108. ch handles FIGURE 7 3 Pull the power and cooling unit out of the array Put one index finger through each of the latch handles With your thumbs on the top of the chassis for support pry the power and cooling unit out of its connectors with an upward rotation Once it is out approximately 1 inch 2 5 cm the unit will be free to slide out of the frame on its rails Caution Any power and cooling unit that is removed must be replaced within 30 minutes or the Sun StorEdge T3 array and all attached arrays automatically shut down and power off Chapter 7 Power and Cooling Unit Assemblies 85 10 11 12 Latch handle Latch handle FIGURE 7 3 Removing the Power and Cooling Unit Insert the new power and cooling unit Lock the new power and cooling unit by pushing in both latch handles Insert the power cord into the power and cooling unit connector Connect the power cord into the AC outlet Verify that the AC LED on the power and cooling unit is amber indicating that AC power is present Push the power and cooling unit power switch on Verify that both LEDs on the power and cooling unit are green indicating that the unit is receiving power Verify the status of the power and cooling unit using the CLI Refer to Checking FRU Status on page 35 for instructions Note After installing the new power and cooling unit the batteries will take some time to recharge 86 Sun StorEdge T3 Array F
109. cking Performance Against Baseline Data on page 107 Storage Automated Diagnostic Environment Message Monitoring on page 108 Manual Examination of the syslog File on page 108 Example syslog Error Messages on page 109 Using CLI Diagnostic Commands on page 110 Chapter 8 Diagnosing and Correcting FC AL Loop Problems 105 m Using the ofdg Diagnostic Utility on page 111 FC AL Loop Problem Indicators The following symptoms indicate possible FC AL loop problems 1 The first indication observed by a customer might be performance degradation in the suspect array See Checking Performance Against Baseline Data on page 107 for more detail 2 A second indication might be Storage Automated Diagnostic Environment StorADE message monitoring from a host that is receiving remote array syslog messages Storage Automated Diagnostic Environment monitoring can be configured to look for particular message classes in the log file that the array entries are written to The program looks through this log file at a customer determined frequency for the specified type of messages and sends e mail if a match is made Typically Storage Automated Diagnostic Environment message monitoring is configured to scan for warning or error messages These message can also be examined in the array s local syslog The e mail recipient can be the customer or any other destination the customer desires See Storage Automated Diagnostic
110. controller firmware perform the following steps 1 Use the ftp binary mode to transfer the firmware to the storage systems directory See Establishing an FTP Session on page 12 2 In a telnet session with the array set the bootmode to auto lt 2 gt set bootmode auto 3 Install the level 3 image on the array lt 3 gt boot i level 3_image_filename 4 Reset the array lt 4 gt shutdown shutdown the system are you sure N y a Press the power button on each power and cooling unit to remove AC power b Press the power buttons again to return AC power to the array Note If during the boot process a controller detects a level 3 firmware version on the system disk different than the level 3 image loaded in flash the controller will reflash its local level 3 image and reset This can appear as two sequential boot cycles This process is expected behavior 58 Sun StorEdge T3 Array Field Service Manual November 2002 CHAPTER 5 Disks and Drives This chapter describes how to monitor and replace the disk drives upgrade the firmware and repair corrupted disk labels This chapter contains the following sections m Monitoring Drive Status on page 59 m Disk Drive LEDs on page 63 m Repairing Disk Drives on page 64 m Check the drive status to ensure that the reconstruction of the replaced drive FRU has completed on page 70 Monitori
111. power switch your equipment has, one of the following symbols may be used:

Off: Removes AC power from the system.

Standby: The On/Standby switch is in the standby position.

Modifications to Equipment

Do not make mechanical or electrical modifications to the equipment. Sun Microsystems is not responsible for the safety regulatory compliance of modified Sun equipment.

Placement of a Sun Product

Caution: To ensure reliable operation of your Sun product and to protect it from overheating, do not block or cover the vents of the equipment. Never place Sun products near radiators or heat sources.

Caution: In accordance with DIN 45 635, Part 1000, the maximum permitted workplace sound pressure level is 70 dB(A).

SELV Compliance

The safety status of the I/O connections complies with SELV requirements.

Power Cord Connection

Caution: Sun products are designed to work with single-phase power systems that have a ground connection. To reduce the risk of electric shock, do not connect Sun products to any other type of power system. Contact
112. ding level 3 code into RAM, the level 2 code jumps to this location.

The starting location for download. For example, the level 2 code downloads the level 3 code from ROM into RAM starting at this location. However, notice that when the starting location is 0x20000, the image in the file is loaded to 0x20040 in RAM. The first 0x40 bytes are occupied by the header information. That is to say that the code_base includes the space occupied by the header information.

Each level of boot code has a unique signature. For example, the level 3 signature is P2L3.

The revision of the code.

The subrevision of the code.

The date stamp of the code. For example, 20001225 means 2000-12-25.

The time stamp of the code. For example, 01020300 means 01:02:03.

Chapter 4 Controller Card Assembly 55

TABLE 4-2 Channel Active LED Descriptions

Header        Description
hdr_counter   For the file to be downloaded (to ROM by the ep command, to RAM through tftp), this field should be 1. But after the code is programmed into ROM, the ep command will change this field automatically. This field is used to identify which of two copies of level 2 code or level 3 code is newer. The smaller the value is, the older the code is. Thus 0xFFFFFFFF is older than 0xFFFFFFFE. The ep command will automatically update this field by taking the value of this field from the other copy and adding 1 to the value.
code_flags    This field is used to identify whether special handl
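The header arithmetic and stamp formats described above can be worked through in a short sketch. The function names are hypothetical, the stamps are treated as 8-digit strings, and the last two digits of the time stamp are ignored because the text does not say what they hold; only the 0x40 header size and the example values come from the text.

```python
HEADER_SIZE = 0x40  # bytes occupied by the header, per the text


def image_load_address(code_base):
    """Address where the image body lands when the download starts at code_base."""
    return code_base + HEADER_SIZE


def decode_date_stamp(stamp):
    """Split an 8-digit date stamp such as '20001225' into (year, month, day)."""
    return int(stamp[0:4]), int(stamp[4:6]), int(stamp[6:8])


def decode_time_stamp(stamp):
    """Split an 8-digit time stamp such as '01020300' into (hour, minute, second).

    The trailing two digits are not described in the text and are ignored here.
    """
    return int(stamp[0:2]), int(stamp[2:4]), int(stamp[4:6])
```

With these helpers, image_load_address(0x20000) gives 0x20040, matching the example in the text.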
113. disk linkstat uld1 9 path 1 DISK LINKFAIL LOSSSYNC LOSSSIG PROTO 30 30 12 249 30 4 30 30 30 ERR INVTXWORD I CO GOGU O o VCRC ldl 1d2 1d3 1d4 1d5 1d6 1d7 1d8 1d9 GG Go Ge GG Ga ENG 9 o eo o Lo o uw a O o o 0o COCO Oro Or One D OOOO OO C5 gt pe gt 30 30 1 198 30 1 19 30 30 co o 0 0 0 Sun StorEdge T3 Array Field Service Manual November 2002 CODE EXAMPLE 8 7 disk linkstat Command Split Loop Ouput From U2 Controller lt 26 gt disk linkstat u2d1 9 path 0 DISK LINKFAIL LOSSSYNC LOSSSIG PROTOERR INVTXWORD INVCRC u2d1 Disk Link Status Failed u2d2 Disk Link Status Failed u2d3 Disk Link Status Failed u2d4 Disk Link Status Failed u2d5 Disk Link Status Failed u2d6 Disk Link Status Failed u2d7 Disk Link Status Failed u2d8 Disk Link Status Failed u2d9 Disk Link Status Failed fail lt 27 gt disk linkstat u2d1 9 path 1 DISK LINKFAIL LOSSSYNC LOSSSIG PROTOERR INVTXWORD INVCRC u2dl 0 0 0 0 1 0 u2d2 0 0 0 0 30 0 u2d3 0 0 0 0 al 0 u2d4 0 0 0 0 30 0 u2d5 0 0 0 0 30 0 u2d6 0 6 0 0 30 0 u2d7 0 0 0 0 30 0 u2d8 0 0 0 0 30 0 u2d9 0 0 0 0 1 0 pass Diagnosing an FC AL Loop This section describes how to diagnose an FC AL loop problem This section contains the following sub sections FC AL Loop Problem Indicators on page 106 Che
114. e /etc/hosts file contains the previous IP address and array name. For example:

    192.129.122.111 array-name

In this example, 192.129.122.111 is the IP address assigned previously.

17. Verify that the /etc/nsswitch.conf file on the RARP server references the local system files. To ensure that the Solaris software environment uses the changes made to the /etc/ethers and /etc/hosts files, edit the host and ethers entries in the /etc/nsswitch.conf file so that the files parameter appears before the [NOTFOUND=return] statements. For example:

    hosts:  nis files [NOTFOUND=return]
    ethers: nis files [NOTFOUND=return]

18. Ensure that the RARP daemon is running on the RARP server:

    rarpserver# ps -eaf | grep rarpd

19. If the RARP daemon is not already running on the RARP server, start it by entering:

    rarpserver# /usr/sbin/in.rarpd -a &

Chapter 9 Chassis Backplane Assembly 131

20. Verify that AC power is present on each of the chassis power and cooling units. The AC LED on each power and cooling unit glows solid amber, and the fans turn at low speed.

21. Press the power button on the power and cooling units to power on the array(s). FIGURE 9-1 shows the power button location. The AC and power supply (PS) LEDs on the power and cooling units show green.

After you power on, the Sun StorEdge T3 array JumpStart feature reassigns the array's previous IP address to the new MAC address. Allow time to complete the boot cycle. W
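The RARP server entries for a replacement chassis can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the MAC address is written without leading zeros to match the example entry shown for the ethers file, and a single space separates the fields.

```python
def format_ethers_entry(mac, name):
    """Build an /etc/ethers line such as '8:0:20:6d:93:7e array-name'."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("a MAC address needs six colon-separated octets")
    values = [int(octet, 16) for octet in octets]
    if any(value < 0 or value > 0xFF for value in values):
        raise ValueError("each octet must be between 0x00 and 0xff")
    # Lowercase hex without leading zeros, matching the manual's example.
    return ":".join(format(value, "x") for value in values) + " " + name


def format_hosts_entry(ip, name):
    """Build an /etc/hosts line such as '192.129.122.111 array-name'."""
    parts = ip.split(".")
    if len(parts) != 4 or not all(p.isdigit() and int(p) <= 255 for p in parts):
        raise ValueError("expected a dotted-quad IPv4 address")
    return ip + " " + name
```

Only the ethers entry changes when a chassis is swapped; the hosts entry keeps the previously assigned IP address and array name.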
115. e T3 Array Field Service Manual is designed to provide the qualified service trained maintenance provider with sufficient information to effectively troubleshoot and resolve any Sun StorEdge T3 array failure The procedures in this manual describe how to isolate the failure remove and replace component s effectively reconfigure the module and system and place the product back into the customer s network Before You Read This Book Make sure you have prepared by reviewing the Sun StorEdge T3 Array Installation and Configuration Manual Sun StorEdge T3 Array Administrator s Manual and Sun StorEdge T3 Array Release Notes Work with the site system administrator to determine if any external hardware or software products are required to repair this device xxvii How This Book Is Organized This manual is organized as follows Chapter 1 provides a troubleshooting overview on the Sun StorEdge T3 array Chapter 2 describes how to connect to and boot the Sun StorEdge T3 array Chapter 3 provides the qualified service provider with troubleshooting techniques for the Sun StorEdge T3 array Chapter 4 describes how to monitor and replace the controller card and upgrade the firmware Chapter 5 describes how to monitor and replace the disk drives and upgrade the firmware Chapter 6 describes how to monitor and replace the interconnect card and upgrade the firmware Chapter 7 describes how to replace the power and coolin
116. e T3 array In a fully redundant configuration cache is set to write behind mode In a nonredundant configuration cache is set to write through Read caching is always performed The Sun StorEdge T3 array default that automatically disables a disk drive that has failed The Sun StorEdge T3 array default that automatically reconstructs data onto a new disk drive from one of the other drives The process of data transfer between the host and the drives 225 C command line interface CLD controller unit E erasable programmable read only memory EPROM F Fibre Channel Arbitrated Loop FC AL field replaceable unit FRU G gigabit interface converter GBIC gigabyte GB or Gbyte The interface between the Sun StorEdge T3 array s pSOS operating system and the user in which the user types commands to administer the array A Sun StorEdge T3 array that includes a controller card It can be use as a standalone unit or configured with other Sun StorEdge T3 arrays Memory stored on the controller card useful for stable storage for long periods without electricity while still allowing reprogramming A 100 MB sec serial channel which allows connection of multiple devices disk drives and controllers A component that is easily removed and replaced by a field service engineer or a system administrator An adapter used on an SBus card to convert fiber optic signal to copper One gigab
117. e all I O going to all volume s in that disk array and associated partner group Notify all applications to stop accessing any affected volumes This may require stopping the application Verify that all drive activity has stopped The solid green drive activity LEDs indicate that the drives are idle 3 If the disk array is using any volume manager software such as VERITAS disable transactions to the volumes that reside on the array backplane you wish to replace and all other volumes in that partner group Consult the appropriate volume manager documentation for information on disabling the data hosts access to the array volumes Chapter 8 Diagnosing and Correcting FC AL Loop Problems 117 nN Unmount the volume s from the Solaris host unmount T3 filesystem name 5 Unmount the internal array volume s lt 4 gt vol unmount voll 6 Disconnect the fiber optic cables from the array controllers N Establish a serial connection and Tip session to the Master RAID controller of the problem array See Establishing a Serial Port Connection on page 7 8 Execute the set command and note the current values of logto and loglevel lt 1 gt set bootmode auto bootdelay 3 sn 112035 ip 10 4 35 134 netmask 255 255 255 0 gateway 10 4 35 1 tftphost 123 123 123 6 tftpfile releases nb210 nb210p20 bin hostname gatest timezone GMT 00 vendor 0301 model 501 5710 02 51 revision 020100 logto Aug9
118. e down 91 xxiii FIGURE 7 5 FIGURE 7 6 FIGURE 7 7 FIGURE 7 8 FIGURE 9 1 FIGURE 9 2 FIGURE 9 3 FIGURE 9 4 FIGURE 10 1 FIGURE 10 2 FIGURE 10 3 FIGURE 10 4 FIGURE A 1 FIGURE A 2 FIGURE A 3 FIGURE A 4 FIGURE A 5 FIGURE A 6 FIGURE A 7 FIGURE A 8 FIGURE C 1 Removing the Screws from the PCU Bottom Panel 91 Lifting the PCU Bottom Panel and Battery Slightly Away from the Unit 92 The Battery Connector Details Inside the PCU 93 UPS Battery Setting Right Side Up 94 Power Switch Location 127 Serial Number and MAC Address on Pull out Tab 127 Removing the Chassis 128 Serial Number and MAC Address on Pull out Tab 130 Connecting the Interconnect Cables 140 Fully Cabled Partner Group 141 Location of Pull Out Tab With MAC Address 142 Interconnect Cable Location 151 Sun StorEdge T3 Array Front View 160 Sun StorEdge T3 Array Back View 161 Door Assembly 162 Interconnect Card Assembly 163 Power Supply 164 Controller Card 165 Drive Assembly 166 Cables and Interconnects 167 Loop Port Diagram 181 xxiv Sun StorEdge T3 Array Field Service Manual November 2002 TABLE 1 1 TABLE 1 2 TABLE 3 1 TABLE 3 2 TABLE 4 1 TABLE 4 2 TABLE 5 1 TABLE 5 2 TABLE 6 1 TABLE 7 1 TABLE A 1 TABLE A 2 TABLE A 3 TABLE A 4 TABLE A 5 TABLE A 6 TABLE A 7 TABLE B 1 TABLE B 2 TABLE B 3 Tables Levels of Message Notification 3 FRU Identifiers 3 Dia
119. e substituted vol recon to stanby drive has completed 60 Sun StorEdge T3 Array Field Service Manual November 2002 Checking the Hot Spare 1 Use the vol list command to check the location of the hot spare standby drive lt 41 gt vol list volume voll vol2 vol3 vol4 capacity 134 134 101 101 890 GB 890 GB 167 GB 167 GB raid o O1 o1 data uld1 5 u2d1 5 uld6 9 u2d6 9 standby none none none none 2 Use the vol stat command to check the status of the hot spare drive lt 42 gt vol stat voll mounted vol2 mounted vol3 mounted vol4 mounted uldl 0 u2d1 0 uld6 0 u2d6 0 All drives should show a status of 0 See TABLE 5 1 for definitions of drive status codes uld2 u2d2 uld7 u2d7 0 uld3 u2d3 uld8 u2d8 0 uld4 u2d4 uld9 u2d9 0 uld5 u2d5 Chapter 5 Disks and Drives 61 Checking Data Parity Caution It can take up to several hours for the parity check once the vol verify command is executed Execution of this command might affect system performance depending on system activity and the verification rate selected Use the vol verify command to perform a parity check of the drives lt 7 gt vol verify volume name You can also use the fix and rate options lt 7 gt vol verify volume name fix rate lt 1 8 gt Where m fix recalculates and rewrites the parity block if
120. econds Then re install the Interconnect cards 5 If configured as an enterprise configuration disconnect the interconnect cables from the alternate master controller See FIGURE 3 6 40 Sun StorEdge T3 Array Field Service Manual November 2002 Alternate Ethernet master connection controller unit Application host Interconnect cables HBAs Master controller unit FC AL connection Management host Ethernet connection G Ethernet port Secure private LAN FIGURE 3 6 Single Host With Two Controller Units Configured as a Partner Group 6 Power on the array or the master controller unit of an enterprise configuration The array starts to boot automatically 7 Stop the boot process at the cancellation message by pressing the Return key T3B EP Release 2 00 2001 06 22 16 07 00 172 20 57 31 Copyright C 1997 2001 Sun Microsystems Inc All Rights Reserved Found units ul ctr u2 ctr tftp boot is enabled hit the RETURN key within 3 seconds to cancel Cancelled T3B EP gt Chapter 3 Diagnosing T3 Array Problems 41 8 Set the array to boot from the tftp boot server See Using tftpboot to Boot a Single Array or a Partner Group Remotely on page 13 Verify that the bootmode tftp host and tftp file settings are correct 300 EP gt set bootmode tftp
121. een or amber recognized

Green             Controller OK
Amber             Controller boot, shutdown, or firmware download in progress
Blinking amber    Controller failure; OK to replace controller

Note: Verify a controller card failure using the CLI.

Removing and Replacing a Controller Card

Note: A new feature of the version 2.0 controller firmware is Autoversioning. This feature allows you to seamlessly update from a Sun StorEdge T3 array to a Sun StorEdge T3+ array. When a controller card is replaced, Autoversioning ensures that the new controller is flashed with the latest firmware version of the existing array controller of an enterprise configuration, and that both controllers are therefore running the same firmware version.

A controller card can be replaced without system interruption only if the array is configured in a partner group (redundant controller unit configuration).

Caution: A removed controller card must be replaced within 30 minutes, or the Sun StorEdge T3 array and all attached arrays will automatically shut down and power off.

To replace the controller card:

1. Observe static electricity precautions. See "Static Electricity Precautions" on page 5.
2. Ensure that the controller card is showing failure status.
3. Remove the Ethernet cable from the 100BASE-T connector.
4. Remove the fiber-optic cable and MIA, if applicable, from the FC-AL connector.
122. efresh s command 88 removing and replacing 90 service life 90 blocksize 170 boot auto 9 commands 9 defaults 169 how to 27 i option 15 mode 13 tftp 13 tftp server 16 bootdelay 169 bootmode 169 C cable assemblies 167 cabled partner group 141 cache 170 cache memsize 171 chassis see midplane 125 commands descriptions of 215 See Also individual commands connecting the cables interconnect cables 140 controller cards 165 enabling disabling 116 firmware upgrade 51 LEDs 47 231 replacing 49 116 upgrading EPROM 51 CPATH 103 D data parity checking 62 device not mounted message 29 diagnosing problems see troubleshooting disk array see Sun StorEdge T3 disk tray disk download command 72 disk drives assembly 166 firmware 70 hot spare 61 LEDs 63 monitoring 59 rebuilding 68 removing and replacing 64 repair 64 status 60 status codes 60 status messages 60 upgrading firmware 70 disk linkstat command 103 110 disk pathstat command 101 disk tray see Sun StorEdge T3 disk tray disk tray settings 135 door assembly 162 dot commands 95 E EPROM 9 51 error message type 174 Error severity level 3 error messages see messages Exception reset log type 191 F fail over determining 30 FAIL POLICY 103 failed FRU status 95 FC AL loop identifiers 219 FC AL loop problems see loop problems firmware controller 33 disk drive 33 EPROM
123. egabytes per second MB sec P parity partner group power and cooling unit pSOS R read caching reliability availability serviceability RAS 228 The main controller unit in a partner group configuration A unique address that identifies a storage location or a device An adapter that converts fiber optic light signals to copper One megabyte is equal to one million bytes 1x106 A performance measurement of the sustained data transfer rate Additional information stored with data on a disk that enables the controller to rebuild data after a drive failure A pair of interconnected controller units A FRU component in the Sun StorEdge T3 array The unit contains a power supply cooling fans and an integrated UPS battery A Sun StorEdge T3 array contains two power and cooling units A real time operating system used as the primary operating system for the Sun StorEdge T3 array Data for future retrieval to reduce disk I O as much as possible Product features that include high availability easily serviced components that are very dependable Sun StorEdge T3 Array Field Service Manual November 2002 redundant array of independent disks RAID S Simple Network Management Protocol SNMP synchronous dynamic random access memory SDRAM system area U uninterruptable power source UPS unit interconnect card UIC V volume A configuration in which mu
124. ence when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy and, if it is not installed and used in accordance with the instruction manual, it may cause harmful interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference, in which case the user will be required to correct the interference at his own expense.
Shielded Cables: Connections between the workstation and peripherals must be made using shielded cables to comply with FCC radio frequency emission limits. Networking connections can be made using unshielded twisted-pair (UTP) cables.
Modifications: Any modifications made to this device that are not approved by Sun Microsystems, Inc. may void the authority granted to the user by the FCC to operate this equipment.

FCC Class B Notice
This device complies with Part 15 of the FCC Rules. Operation is subject to the following two conditions:
1. This device may not cause harmful interference.
2. This device must accept any interference received, including interference that may cause undesired operation.
Note: This equipment has been tested and found to comply with the limits for a Class B digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference in a residential installation. This equipment generates, uses, and can radiate rad
125. er Elektriker kann Ihnen die Daten zur Stromversorgung in Ihrem Geb ude geben Caution Achtung Sun Produkte sind f r haben die gleichen Nennwerte Herk mmliche im Haushalt verwendete Verlangerungskabel besitzen keinen Uberlastungsschutz und sind daher fiir Computersysteme nicht geeignet Caution Achtung Nicht alle Netzkabel Caution Achtung Ihr Sun Ger t wird mit einem dreiadrigen Netzkabel f r geerdete Netzsteckdosen geliefert Um die Gefahr eines Stromschlags zu reduzieren schlie en Sie das Kabel nur an eine fachgerecht verlegte geerdete Steckdose an Die folgende Warnung gilt nur f r Ger te mit Wartezustand Netzschalter dieses Ger ts schalten nur auf Wartezustand Stand By Modus Um die Stromzufuhr zum Ger t vollst ndig zu unterbrechen m ssen Sie die Netzkabel aus der Steckdose ziehen Alle Netzkabel m ssen ausgesteckt sein um die Stromverbindung zum Produkt zu unterbrechen Schlie en Sie die Stecker der Netzkabel an eine in der N he befindliche frei zug ngliche geerdete Netzsteckdose an i Caution Achtung Die Ein Aus Schalter Lithiumbatterie verf gen ber eine Echtzeituhr mit integrierter Lithiumbatterie Teile Nr MK48T59Y MK48TXXB XX MK48T18 XXXPCZ M48T59W XXXPCZ M4T28 XXYYSHZ oder MK48T08 Diese Batterie darf nur von einem qualifizierten Servicetechniker ausgewechselt werden da sie bei falscher Handhabung explodieren kann Werfe
126. Power and Cooling Unit LEDs
Each power and cooling unit has an AC LED and a power supply (PS) LED. TABLE 7-1 lists the possible conditions of these LEDs and describes each state.
FIGURE 7-2 Power and Cooling Unit LEDs (labels: AC LED, PS LED)

TABLE 7-1 Power and Cooling Unit LED Descriptions
AC LED (green or amber) / PS LED (green or amber): Description
Off / Off: Power is off. No AC input.
Amber / Off: Power is off. Power switch turned off. AC power is available.
Green / Off: Occurs when array is shut down. PCU disabled. AC power is available.
Green / Green: Normal operating state. PCU receiving AC power. Power switch is turned on. AC power is available.
Amber / Amber: Switch is off. Array powers off after PCU is disabled.
Green / Amber: Indicates one or more of the following: over-temperature condition, PCU disabled; DC power not available, PCU disabled; both fans fault, PCU disabled; battery on refresh cycle.
Green / Blinking green: Battery not ready (charging).
Green / Blinking amber: Indicates one or more of the following: PCU disabled; one fan fault; battery hold time low, PCU remains enabled; battery out of warranty, PC
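The LED table above can be treated as a lookup keyed on the two LED colors. The following is a minimal Python sketch, assuming only the steady-state rows of TABLE 7-1; the names PCU_LED_STATES and describe_pcu_leds are hypothetical and are not part of the array CLI or any Sun tool.

```python
# Subset of TABLE 7-1: (AC LED, PS LED) -> description.
# Only steady (non-blinking, non-fault) rows are included in this sketch.
PCU_LED_STATES = {
    ("off", "off"): "Power is off; no AC input",
    ("amber", "off"): "Power switch turned off; AC power is available",
    ("green", "off"): "Array is shut down; PCU disabled; AC power is available",
    ("green", "green"): "Normal operating state",
    ("amber", "amber"): "Switch is off; array powers off after PCU is disabled",
}


def describe_pcu_leds(ac_led: str, ps_led: str) -> str:
    """Look up a PCU state from the observed AC and PS LED colors."""
    key = (ac_led.strip().lower(), ps_led.strip().lower())
    # Anything not listed above (for example blinking states) is deferred
    # to the full table rather than guessed at.
    return PCU_LED_STATES.get(key, "Not in this sketch; consult TABLE 7-1")


if __name__ == "__main__":
    print(describe_pcu_leds("Green", "Green"))
```

A lookup like this is only a note-taking aid for the field; the fru stat output remains the authoritative status source.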
127. erify the repair by using the listed CLI status commands. See "Using CLI Diagnostic Commands" on page 110.
8. If replacing the u1 controller card does not correct the problem, replace the u2 RAID controller in the u2 enclosure.
9. If replacing the two RAID controllers does not correct the problem, proceed to replacing disk drives as described in the next section.

Off-Line Drive Diagnostics and Replacement
If replacing the interconnect and RAID controller cards does not resolve the loop 1 (path 0) problem, the next step is to test and, if necessary, replace any suspect disk drives. The test to use is the ofdg (off-line diagnostic) utility.
The ofdg diagnostic requires the array partner group to be removed from host access. This is a highly disruptive procedure that stops all data access to the array. Coordinate and schedule this downtime with the customer.
To administer and monitor the test, connect a serial maintenance cable and open a tip session to the Sun StorEdge T3 array.
The following steps describe how to test for the above example of a suspected loop 1 (path 0) problem.
Note: Before running the ofdg utility, all disks other than those located in the u1 tray must be assigned to a LUN. Problems may occur if ofdg is run on systems where non-u1 disks have not been assigned to volumes.
1. Make sure that all disks other than u1 are assigned to a LUN.
2. Quiesc
128. es the following advantages over the Telnet connection:
- Boot messages are displayed when the array boots.
- The tftp boot configuration is available.
- EPROM access is available.
- Useful for debugging RARP (IP address assignment) issues.
- Array-specific troubleshooting commands can be issued to each controller in an enterprise configuration.
The status of the array unit can quickly be determined from the CLI. The syslog file of the array file system contains a record of events that have occurred in the unit.
To start a serial connection and session with the array:
1. Connect a serial cable from the serial port on the array master unit to any available serial port on a host system.
Note: The serial cables used by the Sun StorEdge T3 and T3+ arrays are different. Both cables are supplied in the F370-4119-02 Diagnostic Kit. The T3 array uses the serial cable with RJ-11 connectors, and the T3+ array uses the serial cable with RJ-45 connectors.
The serial port on the array is on the controller card backplane.
FIGURE 2-1 Serial Port Location (label: serial port)
2. On the host system, open a terminal window and type tip, the baud rate, and the serial port designation. For example:
    mymachine# tip -9600 /dev/ttya
    connected
    Password:
    Invalid name
    Login: root
    Password:
    T3B Release 2.00 2001/04/02 15:21:29 (192.168.209.243)
    Copyright (C) 1997-2001 Sun M
129. et network The CLI can then be run from any host that can access the array subnetwork The advantages that a Telnet connection provides over a serial port connection are as follows m You can have multiple windows open for each array Chapter 2 Connecting to the Sun StorEdge T3 Array 9 The Telnet connection provides a faster interface than the serial port connection which can be useful for displaying syslog information You can quickly determine the status of the array unit from the CLI The syslog file on the array file system contains a record of events that have occurred in the unit and can also be examined through the CLI To open a Telnet connection and start a session with the array On the management host use the telnet command with the array name or IP address to connect to the array For example to telnet to a array named T3 1 mgmt host telnet T3 1 Trying 41231253123 Connected to T3 1 Escape character is Telnet session 123 123 123 1 Login root Password passwd T3B Release 2 1 2002 04 02 15 21 29 192 168 209 243 Copyright C 1997 2001 Sun Microsystems Inc All Rights Reserved 3 2515 where password is the root password 2 Verify the array has a root password by typing it at the prompt If no root password is set on the system press Return at the password prompt to enter the CLI Use the password command to establish a password 3 To view the available command
130. etry in which they were previously created. Ignore any SVD_PATH_FAILOVER or SVD_CHECK_ERROR messages that occur.
    T3:/:<6> vol add vol1 data u1d1-8 raid 5 standby u1d9
    T3:/:<8> vol add vol2 data u2d1-8 raid 5 standby u2d9
    T3:/:<9> vol stat
    vol1        u1d1 u1d2 u1d3 u1d4 u1d5 u1d6 u1d7 u1d8 u1d9
    unmounted   0    0    0    0    0    0    0    0    0
    vol2        u2d1 u2d2 u2d3 u2d4 u2d5 u2d6 u2d7 u2d8 u2d9
    unmounted   0    0    0    0    0    0    0    0    0
16. Fast-initialize the array volumes by typing:
    T3:/:<10> vol init vol1 fast
    WARNING - Existing volume data won't be changed.
    Continue ? [N]: y
    T3:/:<11> vol init vol2 fast
    WARNING - Existing volume data won't be changed.
    Continue ? [N]: y
17. Mount the array volumes:
    T3:/:<12> vol mount vol1
    T3:/:<13> vol mount vol2
    T3:/:<14> vol stat
    vol1        u1d1 u1d2 u1d3 u1d4 u1d5 u1d6 u1d7 u1d8 u1d9
    mounted     0    0    0    0    0    0    0    0    0
    vol2        u2d1 u2d2 u2d3 u2d4 u2d5 u2d6 u2d7 u2d8 u2d9
    mounted     0    0    0    0    0    0    0    0    0
18. Enable volume slicing, if applicable, and restore the slices as they previously existed.
19. Restore the LUN masking settings on the volume slices, as applicable.
20. Verify that the application host can access the restored array LUNs by typing:
    luxadm probe
21. Rescan the devices with Volume Manager, if applicable, by typing:
    vxdctl enab
131. g a Disk Drive
7. Release the latch handle on the disk drive to be installed.
8. Insert the new disk drive gently on the middle of the rails and push it in until it is seated with the centerplane connector. Use a coin or small screwdriver to press in and lock the latch handle.
9. Replace the front panel.
Note: Replace the front panel for the array to meet FCC compliance requirements.
10. Type fru list undn to verify the firmware revision of the new disk drive, where:
- un is the unit (u) number n
- dn is the drive (d) number n
See "Check the drive status to ensure that the reconstruction of the replaced drive FRU has completed" on page 70 for instructions, if necessary.

Rebuilding a Replaced Drive
A replaced drive should begin to rebuild itself automatically.
Note: If a standby drive is configured, data is not copied back from the hot spare to a newly replaced data drive until the reconstruction of data to the hot spare from parity is completed. This means that you might not see any activity lights immediately after replacing a drive.
If automatic reconstruction does not start or fails, begin the rebuild of the replaced drive FRU manually as follows:
On the host, use the telnet command with the array name or IP address to connect to the array.
    mngt_host# telnet array name
    Trying 129.150.47.101...
    Connected to 129.150.47.101.
    Escape character is '^]'.
    Tel
132. g to sort out Sun StorEdge T3 array related loop problems or interpret the Sun StorEdge T3 array syslog, it is important to have the data host messages file available. If you are troubleshooting a live array, you should always enable remote syslogging and monitor the host messages and array messages at the same time. A laptop and Ethernet hub come in handy here. It is also important to verify that the time and date are the same on both the arrays and the data hosts.
The following commands are also useful for finding all the targets and initiators on the loop in question:
    luxadm -e port
    luxadm -e dump_map device
where device is from the output of the previous command.
To find the target IDs and WWNs of the array ports, use:
    <n> port list
    <n> port listmap
There is a table of internal AL_PA/target mapping in the appendix of this document. Armed with this information, you should be able to identify each device on the loop.
When debugging, it is also useful to reset the syslog on the array and the remote syslog host, to clear out any noise from earlier testing, problems, or the initial install.
    <n> set logto 1
    <n> mv syslog syslog.bak
    <n> logger Starting New Syslog xx/xx/xxxx > syslog
    <n> set logto *

The Basic Message Format
    May 18 16:36:08 FCC0[1]: N: u1ctr ITL 7D 1 0 TT
133. g unit and monitor the UPS Chapter 8 describes how to diagnose and correct back end FC AL drive loop problems with the Sun StorEdge T3 array Chapter 9 describes how to replace the chassis backplane assembly Chapter 10 describes how to reconfigure the Sun StorEdge T3 array into partner groups and single controller units Appendix A contains part numbers and illustrations of field replaceable units Appendix B lists the Sun StorEdge T3 array defaults Appendix C contains a description of the messages that can be reported by the array Appendix D contains descriptions of the commands supported by the Sun StorEdge T3 array Appendix E lists the FC AL loop identified by AL_PA switch and setting values Appendix F contains a blank worksheet for the qualified service provider to make notes at each customer site xxviii Sun StorEdge T3 Array Field Service Manual November 2002 Using UNIX Commands This document contains some information on basic UNIX commands and procedures such as booting the devices For more information outside of this document see the following m AnswerBook2 online documentation for the Solaris software environment m Other software documentation that you received with your system Preface xxix Typographic Conventions TABLE P 1 Typographic Conventions Typeface Meaning Examples AaBbCc123 The names of commands files and directories on screen computer output AaBbCc123 Wh
134. ge 169
- "System Defaults" on page 170
- "Volume Defaults" on page 171
- "Default Directories and Files" on page 172

Boot Defaults
Specify boot defaults with the set command. When run without any parameters, the set command displays the current values. See the Sun StorEdge T3 Array Administrator's Manual for information on using the set command.

TABLE B-1 Default Settings - set List
Parameter    Default          Variables
bootmode     auto             [auto | tftp | none]
bootdelay    3                Number of seconds
sn           Number           Serial number
ip           n.n.n.n          Unit IP address
netmask      255.255.255.0    Unit netmask
gateway      n.n.n.n          Network gateway IP address
tftphost     n.n.n.n          IP address of TFTP server
tftpfile     value            Boot code file identification number (39-character maximum)
hostname     machinename      Machine name of the Sun StorEdge T3 host machine (39-character maximum)
vendor       vendorname       Name of manufacturer or vendor
model        modelnumber      Controller model number (set at EP level)
revision     Onnn             Controller EP revision (EP writes this value)
logto        *                [1 | * | filename], where 1 forces logging to the serial console and * directs the logging daemon to log as specified in the /etc/syslog.conf file
loglevel     3                [0 | 1 | 2 | 3 | 4], where 0 = no logging at all, 1 = error messages only, 2 = warning and higher messages, 3 = notice and higher messages, 4 = all message levels including info
rarp         on               [on | off]
mac          n:n:n:n:n:n      Controller M
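The bootmode and loglevel values in the defaults table can be sanity-checked before they are typed at the array prompt. The following is a minimal Python sketch that assumes only the values quoted in TABLE B-1; the function name check_boot_settings is hypothetical and does not correspond to any array command.

```python
# Allowed values copied from TABLE B-1 (set list defaults).
BOOTMODES = ("auto", "tftp", "none")
LOGLEVELS = {
    0: "No logging at all",
    1: "Error messages only",
    2: "Warning and higher messages",
    3: "Notice and higher messages",
    4: "All message levels including info",
}


def check_boot_settings(bootmode: str, loglevel: int) -> str:
    """Validate a planned bootmode/loglevel pair and describe the result."""
    if bootmode not in BOOTMODES:
        raise ValueError(f"bootmode must be one of {BOOTMODES}, got {bootmode!r}")
    if loglevel not in LOGLEVELS:
        raise ValueError(f"loglevel must be 0-4, got {loglevel!r}")
    return f"bootmode={bootmode}; loglevel {loglevel}: {LOGLEVELS[loglevel]}"


if __name__ == "__main__":
    # The table lists auto and 3 as the defaults.
    print(check_boot_settings("auto", 3))
```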
135. ge Automated Diagnostic Environment message monitoring was run on that host and scanned the log, looking for array log file messages of a warning or error class.
The data in the example above indicates that drives u1d4-9 in the u1ctr controller completed a path failover from loop 1 (path 0) to loop 2 (path 1). This means that a hard failure occurred, or a threshold count was exceeded, on the u1l1 loop. At this time, drives u1d4-9 are being serviced by the u1ctr only through the u1l2 loop. This is a good indication that there has been some kind of failure in the u1l1 interconnect card, the u1ctr controller, or one of the u1d1-9 drives.

Manual Examination of the syslog File
If Storage Automated Diagnostic Environment message monitoring is not running, the Sun StorEdge T3 array CLI can be used to examine the unit's syslog. Use either the cat or more command on the log file. Either command outputs the complete log to the Telnet or tip session screen.
Alternatively, you can ftp the syslog file to the Telnet or tip host and examine it with a text editor capable of performing text searches with a character match. In the case of the example shown above, a search would be done for the error message type field of W. Such a search might display data similar to the following:
    Mar 07 18:33
    Mar 07 18:33
    Mar 07 18:33
    Mar 07 18:33
    Mar 07 18:33
    Mar 07 18:33
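The manual search for warning-class entries can also be done with a short script once the syslog file has been copied to the host. The following is a minimal Python sketch under stated assumptions: the line layout (timestamp, task name with a bracketed number, one-letter message type, text) and the type letters E, W, N, and I are assumed here, and the sample lines are made up for illustration, not taken from a real array.

```python
import re

# Assumed layout: "Mon DD HH:MM:SS TASK[n]: T: message text"
LINE_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<task>\w+)\[(?P<unit>\d+)\]: "
    r"(?P<level>[EWNI]): (?P<text>.*)$"
)


def filter_by_level(lines, levels=("E", "W")):
    """Return parsed entries whose message type is in `levels`."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped, not guessed at.
        if match and match.group("level") in levels:
            hits.append(match.groupdict())
    return hits


if __name__ == "__main__":
    sample = [
        "Jan 01 00:00:00 PROC[1]: W: sample warning text",
        "Jan 01 00:00:01 PROC[1]: N: sample notice text",
    ]
    for entry in filter_by_level(sample, levels=("W",)):
        print(entry["stamp"], entry["text"])
```

If the real log layout differs, only LINE_RE needs to change; the filter itself stays the same.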
136. gnostic Functions and Tools 19 Firmware Status Indicators 29 Sun StorEdge T3 Array Controller Card LED Descriptions 48 Channel Active LED Descriptions 55 Drive Status Messages 60 Disk Drive LED Descriptions 63 Interconnect Card LED Descriptions 76 Power and Cooling Unit LED Descriptions 83 Sun StorEdge T3 Array Assemblies 161 Door Assembly 162 Interconnect Card Assembly 163 Power Supply 164 Controller Card 165 Drive Assembly 166 Cable and Interconnect Assemblies 168 Default Settings set List 169 System Default Settings 170 Volume Defaults 171 XXV xxvi TABLE B 4 TABLE C 1 TABLE C 2 TABLE C 3 TABLE C 4 TABLE C 5 TABLE C 6 TABLE C 7 TABLE C 8 TABLE C 9 TABLE C 10 TABLE C 11 TABLE C 12 TABLE D 1 TABLE D 2 TABLE E 1 TABLE F 1 Default Directories and Files 172 Message Types 174 FRU Identifers 175 LIDs corresponding to LUN IDs example 183 Reset Log Message Types 191 Reset Log Messages 192 Boot Message Acronyms 193 Firmware Status Boot Messages 195 Internal Sun StorEdge T3 Array AL_PA LID LOOP Map 203 SVD Disk Error Definitions 204 Stripe Type Messages 205 SCSI Command Set 207 Arbitrated Loop Physical Addresses and Loop IDs 209 Commands Listed in Alphabetical Order 216 FRU Identifiers 217 Assigned Loop Identifier 219 Sun StorEdge T3 array Information Worksheet 222 Sun StorEdge T3 Array Field Service Manual November 2002 Preface The Sun StorEdg
137. > set bootmode auto
13. Reset the array by typing:
    T3:/:<3> reset -y
Verify the system boots normally by observing the console.
- If you have a workgroup configuration, proceed to Step 14.
- If you have an enterprise configuration, continue below.
  i. Shut down the array by typing:
      T3:/:<4> shutdown -y
  ii. Power off the array.
  iii. Reattach the interconnect cables.
  iv. Power on both arrays of the enterprise configuration.
  v. Verify the systems boot normally by observing the console of each controller.
14. Verify that the system parameters are set correctly by typing:
    T3:/:<5> sys list
    blocksize        : 16k
    cache            : off
    mirror           : off
    mp_support       : none
    naca             : off
    rd_ahead         : on
    recon_rate       : med
    sys memsize      : 128 MBytes
    cache memsize    : 1024 MBytes
    enable_volslice  : y n
    fc_topology      : auto
Caution:
- Failure to ensure that the blocksize is set correctly will lead to data loss or corruption.
- Failure to ensure multipathing support is enabled will prevent proper LUN failover in an enterprise configuration.
- Failure to restore volume slices on the correct blocks will cause data loss or corruption.
- Failure to ensure LUN masking is properly restored can result in data inaccessibility on the desired host, or result in improper access from undesired hosts.
15. If the volume information was lost, add the array volumes using the same geom
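Because the cautions above hinge on the restored settings matching what was recorded before the failure, a captured sys list listing can be compared against the recorded values. The following is a minimal Python sketch, assuming the listing has been saved as text in a "name : value" layout; the function names parse_sys_list and find_mismatches are hypothetical, and the sample values are illustrative only.

```python
def parse_sys_list(output: str) -> dict:
    """Turn 'name : value' lines into a dictionary.

    Pass only the body of the listing; a prompt line containing a colon
    would otherwise be picked up as a setting.
    """
    settings = {}
    for line in output.splitlines():
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        settings[name.strip()] = value.strip()
    return settings


def find_mismatches(recorded: dict, current: dict) -> dict:
    """Map each differing setting to (recorded value, current value)."""
    return {
        name: (want, current.get(name))
        for name, want in recorded.items()
        if current.get(name) != want
    }


if __name__ == "__main__":
    listing = "blocksize : 16k\nmp_support : none\n"
    recorded = {"blocksize": "64k", "mp_support": "none"}
    print(find_mismatches(recorded, parse_sys_list(listing)))
```

Any setting reported here should be corrected on the array before volumes are added or mounted.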
138. hat a write-once device or a sequential-access device encountered blank medium or format-defined end-of-data indication while reading, or a write-once device encountered a non-blank medium while writing.
0xA COPY ABORTED: This indicates a COPY, COMPARE, or COPY AND VERIFY command was aborted due to an error condition on the source device, the destination device, or both.
0x7 DATA PROTECT: This indicates that a command that reads or writes the medium was attempted on a block that is protected from this operation. The read or write operation is not performed.
0xC EQUAL: This indicates a SEARCH DATA command has satisfied an equal comparison.
0x4 HARDWARE ERROR: This indicates that the target detected a non-recoverable hardware failure (for example, controller failure, device failure, parity error, etc.) while performing the command or during a self-test.
0x5 ILLEGAL REQUEST: This indicates that there was an illegal parameter in the command descriptor block or in the additional parameters supplied as data for some commands (FORMAT UNIT, SEARCH DATA, etc.). If the target detects an invalid parameter in the command descriptor block, then it shall terminate the command without altering the medium. If the target detects an invalid parameter in the additional parameters supplied as data, then the target may have already altered the medium. This sense key may also indicate that an invalid IDENTIFY
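The sense keys listed above lend themselves to a lookup table when decoding logged values by hand. The following is a minimal Python sketch, assuming only the key values quoted in this excerpt; the function name describe_sense_key is hypothetical and is not part of any array or host tool.

```python
# Sense key values as listed in this excerpt (a subset, not the full set).
SENSE_KEYS = {
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x7: "DATA PROTECT",
    0xA: "COPY ABORTED",
    0xC: "EQUAL",
}


def describe_sense_key(value: int) -> str:
    """Return the sense key name for a value, or mark it as unlisted."""
    key = value & 0x0F  # a sense key is a 4-bit field
    return SENSE_KEYS.get(key, f"UNLISTED (0x{key:X})")


if __name__ == "__main__":
    print(describe_sense_key(0x5))
```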
139. he var adm messages file Identify the failing array by decoding the messages In the failing array are any errors indicated in the array syslog file NO YES Run Storage Automated Diagnostic Environment Does it pass Decode errors and replace failed component in the array Connect loopback Check Intermittent Loop plug to HBA and GBIC MIA and re run test fiber cable Does it pass Does problem persist Check GBIC MIA and fiber cable Does problem persist Replace HBA and rerun test to verify that problem is fixed Change Raid Controller in the array FIGURE 3 1 Data Connection Troubleshooting Flow Chart 22 Sun StorEdge T3 Array Field Service Manual November 2002 Unable to Telnet to the array from the same subnet Have you ever been able to access this array through the network YES Ensure the network cables are properly connected On Admin host is the correct ENET address in etc ethers file and correct IP address in the etc hosts file Verify that etc nsswitch conf file has ethers and hosts entries before NOTFOUND return Fix network cable and reset the array Can it be accessed NO YES Correct the files and restart rarpd as follows ps eaf grep rarpd Kill the PID and restart with usr sbin in rarpd a Hook up a console serial cable to the array and
140. he FRUs
- Remove plastic, vinyl, and foam from the work area.
- Before handling a FRU, discharge any static electric charge by touching a ground surface.
- Wear an antistatic wrist strap.
- Do not remove a FRU from its antistatic protective bag until you are ready to install it.
- When removing a FRU from the array, immediately place it in an antistatic bag and packaging.
- Handle a FRU only by its edges, and avoid touching the circuitry.
- Do not slide a FRU over any surface.
- Limit body movement (which builds up static electricity) during FRU installation.

CHAPTER 2
Connecting to the Sun StorEdge T3 Array
This chapter describes how to connect to the Sun StorEdge T3 array and contains the following sections:
- "Establishing a Serial Port Connection" on page 7
- "Establishing a Telnet Session" on page 9
- "Establishing an FTP Session" on page 12
- "Using tftpboot to Boot a Single Array or a Partner Group Remotely" on page 13
- "Configuring a Server for Remote Booting" on page 16
- "Setting Up Remote Logging" on page 17

Establishing a Serial Port Connection
The serial port is a direct connection to the array from any serial port on any host or system. Individual commands can be run to query and repair the unit from this interface using the command-line interface (CLI). The serial port connection provid
141. he second-level code. The second-level code initializes memory and loads itself to RAM locations starting from 0x500000.
- The second-level code can allow tftp boot or ROM boot for the third-level code. In ROM boot, the second-level code selects one of the two copies of the third-level RAID application code. The second-level code loads the RAID application code to RAM locations from 0x20000.
- The third level is the RAID application.
- The extended Power-On Self-Tests (POST) code is for performing factory-level diagnostics.

First Level Boot Code
The level 1 boot code starts at 0xFFF00100, which is the processor's reset vector. The first-level code initializes the MPC107 bridge chip and the console serial port. It prints T3B when the initialization is done. Then it waits about 1.5 seconds to allow the user to select one of the two copies of level 2 code to boot. The user can type 1 or 2, but there is no echo for the character typed.
If the user makes no selection, the level 1 code makes the selection automatically. In the automatic selection, the level 1 code verifies the level 2 boot code stored in ROM. It finds which copy is newer and jumps to the selected code.
If the user has entered a selection before automatic selection, the level 1 code jumps to the copy the user has selected, after verifying that the code is valid. If the user selects an invalid copy, the level 1 code jumps to the valid one instead of the user-selected one. After the level 1 c
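The level 2 selection rule described above (honor a valid user choice, otherwise take the newer valid copy) can be sketched in a few lines. The following Python sketch is an illustration only: the BootCopy record, its version field, and the function select_level2 are hypothetical stand-ins, and the real level 1 code runs from ROM rather than in Python.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class BootCopy:
    index: int                # 1 or 2, matching the key the user can type
    valid: bool               # result of verifying the copy stored in ROM
    version: Tuple[int, ...]  # hypothetical sortable stand-in for "newer"


def select_level2(copies: Sequence[BootCopy],
                  user_choice: Optional[int] = None) -> BootCopy:
    """Pick the level 2 copy to jump to, following the rule in the text."""
    valid = [c for c in copies if c.valid]
    if not valid:
        raise RuntimeError("no valid level 2 boot code copy")
    if user_choice is not None:
        for copy in valid:
            if copy.index == user_choice:
                return copy   # the user's selection passed verification
        # An invalid selection falls through to a valid copy instead.
    return max(valid, key=lambda c: c.version)  # newer valid copy


if __name__ == "__main__":
    copies = [BootCopy(1, True, (2, 0)), BootCopy(2, True, (2, 1))]
    print(select_level2(copies).index)
```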
142. hen all LEDs are green, proceed to the next step.
22. Check the LEDs at the front and back of the unit to ensure that all components are receiving power and are functional.
While the drives are spinning up, the LEDs blink. The array boot can take up to several minutes, after which all LEDs should be solid green, indicating that the unit is receiving power and there is no drive activity.
Note: The batteries in the power and cooling units recharge after powering on the unit. While the batteries are recharging, write-behind cache is disabled.
Note: If the green power and cooling unit LEDs on connected units do not light, press the power switches on those units.
23. Use the CLI to verify that all components are functioning properly.
To verify status using the CLI, open a Telnet session to the disk array and verify volume and FRU status as described in "Checking FRU Status" on page 35.
    :/:<1> fru stat
    :/:<2> vol stat
Note that when the backplane is replaced, the data host volume's WWN changes. The WWN is derived from the backplane serial number. Since the volume WWN is part of the volume's device path on the data host, the device path definition on the data host changes. Therefore, you must reconfigure the data host to recognize the new WWNs.
24. Configure the data host to recognize the new WWNs by executing the following command on
143. icrosystems, Inc.
    All Rights Reserved.
    :/:<1>
If the Sun StorEdge T3 array is being booted, the following message is displayed:
    auto boot is enabled, hit the RETURN key within 3 seconds to cancel...
In a boot situation, if the Return key is pressed within 3 seconds, the array stops booting and the EPROM takes control of the array. If the Return key is not pressed, the array continues to boot. Note that in a partner group, the alternate master unit continues to boot and appears as the master if the boot sequence of the master is stopped.
The following commands are available for use at the EPROM level:
- boot
- reset
- set
- id
Once the array has fully booted, all the commands available through the CLI are accessible.
Note: Use the login prompt to set the IP address, netmask, and hostname instead of using the EP prompt. Values set for these parameters at the EPROM level will be lost.
For more information on serial connections, see:
- Sun StorEdge T3 Array Administrator's Manual, for instructions on setting up remote logging
- "Checking Array Boot Status" on page 27

Establishing a Telnet Session
The Telnet session is a direct network link to the array unit through the command-line interface (CLI). You can execute individual commands to query and repair the unit from this interface. The Telnet session requires access to the unit's Ethern
144. UPS Battery
The uninterruptible power supply (UPS) battery is located within the power and cooling unit. The battery provides backup in case of a complete AC power failure and sustains power to the array long enough to flush cache data to the drives.
When a Sun StorEdge T3 array is first powered up, write-behind caching is disabled (cache runs in write-through mode) for a short time. Write-behind caching is disabled during cold boots, even if AC power has not been removed from the array, while the firmware attempts to determine the condition of the internal PCU batteries. Once the system determines that the batteries are in an optimal state, system cache mode returns to write-behind. After a power-down, an array re-enables write-behind cache mode in approximately two hours.
During a power failure, if the battery is flushing cache data to the drives, battery power becomes depleted. Once AC power is available, the battery recharges. While the battery is recharging, write-behind cache mode is disabled and write-through cache mode is enabled until the battery is fully recharged. The battery recharge could take up to 12 hours, depending on the length of the power outage and the amount of cache data that was flushed to the drives.
Note: The batteries in the power and cooling units recharge after powering on the array. If the batteries are less than fully charged, fru stat output will display batteries in a fau
145. ield Service Manual November 2002 Testing data Initia Initia Initia Initia Initia Mounti Checki Initia ng ng liz lizi lizi lizi lizi lizi ng ng ng ng ng ing cache memory Passed Cache Memory system DB structure configuration port configuration loop 2 to accept SCSI commands root volume local file system network routes Read PGR data Done Starting Syslog Daemon Waiting for 1 slave controller s to come up ul Configuring local data u2 Initializing drives System has 1 active controller s Initializing IFTB Starting ftpd Starting telnetd Starting timed Starting pshd Starting httpd Starting snmpd Starting schd Checking disk positions Initializing host port ulpl ISP2200 firmware status Host port ulpl TARGET_ID Oxffff ALPA 0x5 Starting psh Login Appendix C Sun StorEdge T3 Array Messages 7 199 Sun StorEdge T3 Array Enterprise Configuration as seen from the Alternate Master Controller T3B 2 Starting POST POST end Starting T3B EP Release 2 01 2002 07 30 16 33 52 129 150 28 80 Copyright C 1997 2002 Sun Microsystems Inc All Rights Reserved Found units ul ctr u2 ctr auto boot is enabled hit the RETURN key within 3 seconds to cancel Starting T3B Release 2 01 2002 07 30 15 21 29 129 150 28 80 Copyright C 1997 2002 Sun Micro
146. in SNXF_EXE 4002 Short non-transfer execution (mode sense/select); SNXF_OUT 4003 Short non-transfer out; LNXF_IN 4004 Long non-transfer in; LNXF_EXE 4005 Long non-transfer execution (i.e., format command); LNXF_OUT 4006 Long non-transfer out; XFR_IN 4007 Transfer in; XFR_EXE 4008 Transfer execution (i.e., read or write); XFR_OUT 4009 Transfer out; TAKEOVER_FAIL 5000; NO_RESP 5001; NO_RESP1 5001 Detected by CPU1; NO_RESP2 5002 Detected by CPU2; OS_FAIL 6000 Operating System Failure; SYSFAIL 7000 System Fatal Error; CBUF_PARITY 7001 Cache Buffer Detected Parity Error; CBUF_SERR 7002. Boot Messages. Boot messages can be extremely useful in troubleshooting situations. The following are examples of standard boot messages on Sun StorEdge T3 arrays having no failures. Typical boot messages for the array workgroup and enterprise configurations appear below for reference. This section consists of the following components: Section Interpreting Boot Messages on page C-193; Section Boot Message Acronyms on page C-193; Section Boot Message Bracket Placement on page C-194; Section Detecting FC-AL Ports and Reporting Firmware Status on page C-194; Section Sun StorEdge T3 Array Workgroup Configuration on page C-195; Section Sun StorEdge T3 Array Enterprise Configuration on page C-198; Section Sun StorEdge T3 Array Enterprise Configuratio
147. in the host data channel are outside of the scope of the Sun StorEdge T3 array. To determine failures in the data path, use the Storage Automated Diagnostic Environment. Refer to the documentation of the selected diagnostics tool for information on identifying data channel failures. Reserved System Area Recovery Procedure. Some of the conditions that indicate a corrupted system area of a Sun StorEdge T3 array are: the controller is disabled or the booting process is cycling; the command-line prompt cannot be accessed using either the Ethernet or a serial interface; the application host cannot communicate with the LUNs. Note: After configuring a system, always record the following data to prepare for the possibility of having to perform a recovery procedure: array block size, multipathing settings, volume configuration, volume slicing configuration, and LUN masking settings. Recovery Procedure. 1. Establish a serial port connection to each Sun StorEdge T3 array. See "Establishing a Serial Port Connection" on page 7. 2. Stop the application and unmount the file systems on the application host for the LUNs defined on the array(s) that are being recovered: unmount filesystem 3. Power off the affected array(s) by pushing both power switches on the PCUs. 4. Clear the controller disable flags by partially removing all Interconnect cards half way out for 30 s
148. ing a battery refresh cycle at a particular time, edit the BAT_BEG MM DD YYYY hh mm ss value in the /etc/schd.conf file, where MM is the month number (January = 1), DD is the day number, YYYY is the year, hh is the hour using a 24-hour clock (6 pm = 18), mm is the minute, and ss is the second (this element is optional). Caution: The battery service life depends on a battery refresh cycle of 28 days. Altering this time span can decrease battery life and should only be done as directed by Sun representatives. Note that the next refresh start time is always calculated from the start time of the previous refresh cycle. If a user manually starts a refresh cycle, then the next refresh depends on the starting time of the manually activated refresh cycle. Note: If a controller failover occurs, the scheduler daemon starts and behaves as it does during a normal system boot. The scheduler reads the schd.log file and, based on the schd.conf file, begins the next refresh process. If a controller failover occurs during the discharge period (6 minutes) or the recharge period (6 to 12 hours), the current refresh process is killed and the next refresh cycle starts at the scheduled refresh time based on the schd.conf file. Consequently, the refresh cycle begins as previously scheduled. The battery service life is 2 years. When the battery approaches its end of life, warning messages are sent to the syslog fi
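The scheduling rule described above (the next refresh is counted from the start of the previous cycle, whether that cycle was scheduled or started manually) can be sketched as follows. This is an illustrative calculation only, not the array's scheduler: the function name `next_refresh` and the `cycle_days` parameter are hypothetical, and the 28-day default is taken from the caution in the text.

```python
from datetime import datetime, timedelta


def next_refresh(previous_start: datetime, cycle_days: int = 28) -> datetime:
    """Return the start time of the next battery refresh cycle.

    Assumption: the next cycle begins exactly cycle_days after the start
    of the previous cycle, as the text describes. Hypothetical helper.
    """
    if cycle_days <= 0:
        raise ValueError("cycle_days must be positive")
    return previous_start + timedelta(days=cycle_days)


# A scheduled cycle and a manually started one both reset the reference point.
scheduled_start = datetime(2002, 5, 25, 20, 31, 19)
manual_start = datetime(2002, 6, 1, 9, 0, 0)
print(next_refresh(scheduled_start))
print(next_refresh(manual_start))
```

Passing a different `cycle_days` value models an array configured with a non-default cycle length; the text advises against changing the cycle unless directed by Sun.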
149. ing of the code file is needed. For example, the code file may be zipped, in which case it needs to be unzipped before uploading to RAM. hdr_cksum: This is the checksum for the header portion. Since the ep command will change hdr_counter when downloading code into ROM, this field will be updated accordingly by the ep command. Currently, only code_signature and hdr_counter affect the automatic selection done by level 1 code or level 2 code. Of course, code_cksum and hdr_cksum are used to validate the code. Level 1 Controller Firmware. This procedure upgrades the level 1 firmware in only one controller. Therefore, you must perform it twice for each array enterprise configuration. 1. Connect a console to the serial port (the right RJ-45 port) of the array. See "Establishing a Serial Port Connection" on page 7. Note: The serial cables used by the Sun StorEdge T3 and T3+ arrays are different. Both cables are supplied in the F370-4119-02 Diagnostic Kit. The T3 array uses the serial cable with RJ-11 connectors, and the T3+ array uses the serial cable with RJ-45 connectors. 2. Set up the tftp host. See "Configuring a Server for Remote Booting" on page 16. 3. Reset the controller and press any key on the console when the system prompt appears within three seconds. 4. Set the bootmode and tftp settings as follows: EP> set bootmode tftp EP> set tftphost XX
150. ins For example lt 3 gt volume vol2 lt 4 gt volume vol2 vol list vol stat capacity raid data standby 236 058 GB 5 u2d1 8 none capacity raid data standby 236 058 GB 5 u2d1 8 u2d9 152 Sun StorEdge T3 Array Field Service Manual November 2002 4 Use the fru list and fru stat commands to verify that the array is functional and ready for operation For example lt 5 gt fru list ID TYPE VENDOR MODEL REVISION SERIAL ulctr controller card 0301 501 5710 02 020100 020101 112035 u2ctr controller card 0301 501 5710 02 020100 020101 112122 uldl disk drive SEAGATE ST336704FSUN A726 3CD1HMKJ uld2 disk drive SEAGATE ST336704FSUN A726 3CD1HH2A uld3 disk drive SEAGATE ST336704FSUN A726 3CD1H9WS uld4 disk drive SEAGATE ST336704FSUN A726 3CD1HM64 uld5 disk drive SEAGATE ST336704FSUN A726 3CD1HMC2 uld6 disk drive SEAGATE ST336704FSUN A726 3CD1HM63 uld7 disk drive SEAGATE ST336704FSUN A726 3CD1HE3A uld8 disk drive SEAGATE ST336704FSUN A726 3CD1HNKO uld9 disk drive SEAGATE ST336704FSUN A726 3CD1HM5P u2d1 disk drive SEAGATE ST336704FSUN A726 3CD1HHH5 u2d2 disk drive SEAGATE ST336704FSUN A726 3CD1HMJC u2d3 disk drive SEAGATE ST336704FSUN A726 3CD1HGKR u2d4 disk drive SEAGATE ST336704FSUN A726 3CD1HLBJ u2d5
151. io frequency energy and, if not installed and used in accordance with the instructions, may cause harmful interference to radio communications. However, there is no guarantee that interference will not occur in a particular installation. If this equipment does cause harmful interference to radio or television reception, which can be determined by turning the equipment off and on, the user is encouraged to try to correct the interference by one or more of the following measures: reorient or relocate the receiving antenna; increase the separation between the equipment and receiver; connect the equipment into an outlet on a circuit different from that to which the receiver is connected; consult the dealer or an experienced radio/television technician for help. Shielded Cables: Connections between the workstation and peripherals must be made using shielded cables in order to maintain compliance with FCC radio frequency emission limits. Networking connections can be made using unshielded twisted-pair (UTP) cables. Modifications: Any modifications made to this device that are not approved by Sun Microsystems, Inc. may void the authority granted to the user by the FCC to operate this equipment. ICES-003 Class A Notice / Avis NMB-003, Classe A: This Class A digital apparatus complies with Canadian ICES-003. Cet appareil numérique de la classe A est conforme à la norme NMB-003 du Canada. ICES-003 Class B Notice / Avis NMB-003, Classe B: This
152. isk tray. This identifier contains a unit constant (u), the unit number (n), the FRU constant (ctr for controller card, pcu for power and cooling unit, l for interconnect card, d for disk drive), and the FRU number (n).
TABLE C-2 FRU Identifiers
FRU                      Identifier   Unit Number
Controller card          unctr        n = unit number (1, 2)
Power and cooling unit   unpcun       n = unit number (1, 2); n = pcu number (1, 2)
Interconnect card        unln         n = unit number (1, 2); n = interconnect number (1, 2)
Disk drive               undn         n = unit number (1, 2); n = disk drive number (1 to 9)
Miscellaneous Abbreviations
LPC: Loop card
BATD: Battery monitor
IPI-3: Intelligent Peripheral Interface. Similar legacy protocol to SCSI; the Sun StorEdge T3 array uses IPI-3 for configuration data
TDL: Transaction disk log
CCB: Command Control Block
SCB: Stripe Control Block
IOCB: ISP2100 IO Control Block. Basically a request put into the queue for the ISP to process
IOSB: ISP2100 Status Block
SVD: SCSI Virtual Disk Driver. This driver is the back-end disk driver in the T3
SVH: SCSI virtual host driver. The front-end Sun StorEdge T3 array driver, which takes host requests for ISP2100 in target mode
XPT: SCSI Transport Layer module in the Sun StorEdge T3 array driver stack
SID: Stripe ID
STYPE: Stripe type
ISR: Interrupt service routine
Interpreting Sun StorEdge T3 Array syslog Messages. When attemptin
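The FRU identifier scheme in TABLE C-2 is regular enough to parse mechanically, which can help when scanning fru list, fru stat, or syslog output. The following is a minimal sketch under that assumption; `parse_fru_id`, `Fru`, and `FRU_TYPES` are hypothetical names invented for illustration, and the pattern covers only the four identifier forms listed in the table.

```python
import re
from typing import NamedTuple, Optional

# FRU constants as listed in TABLE C-2.
FRU_TYPES = {
    "ctr": "controller card",
    "pcu": "power and cooling unit",
    "l": "interconnect card",
    "d": "disk drive",
}

# u<unit><constant>[<fru number>], for example u1ctr, u2pcu1, u1l2, u2d9.
_FRU_RE = re.compile(r"^u(?P<unit>\d+)(?P<kind>ctr|pcu|l|d)(?P<num>\d+)?$")


class Fru(NamedTuple):
    unit: int
    fru_type: str
    number: Optional[int]  # None for controller cards, which carry no FRU number


def parse_fru_id(fru_id: str) -> Fru:
    """Split a FRU identifier into unit number, FRU type, and FRU number."""
    match = _FRU_RE.match(fru_id.strip().lower())
    if match is None:
        raise ValueError(f"not a recognized FRU identifier: {fru_id!r}")
    kind, num = match.group("kind"), match.group("num")
    if kind == "ctr":
        if num is not None:
            raise ValueError(f"controller identifiers take no FRU number: {fru_id!r}")
        return Fru(int(match.group("unit")), FRU_TYPES[kind], None)
    if num is None:
        raise ValueError(f"missing FRU number: {fru_id!r}")
    return Fru(int(match.group("unit")), FRU_TYPES[kind], int(num))


for ident in ("u1ctr", "u2pcu1", "u1l2", "u2d9"):
    print(ident, parse_fru_id(ident))
```

The sketch does not enforce the numeric ranges shown in the table (units 1 and 2, disk drives 1 to 9); add range checks if the input may be untrusted.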
153. isting single controller units. This procedure includes the following sections: "Preparing the Arrays" on page 149; "Establishing a New IP Address" on page 151; "Establishing a Network Connection" on page 152; "Use the vol list and vol stat commands to verify that the phantom volume has been deleted and that the existing volume remains" on page 152. Preparing the Arrays. 1. Back up all data on the partner group. Caution: Make sure you back up data before proceeding. 2. Ensure that the data path between the host and the partner group has been quiesced. There must not be any I/O activity. 3. Start a Telnet session with the master unit. a. On the host, use the telnet command with the array name or IP address to connect to the array: telnet array_name Trying 129.150.47.101 Connected to 129.150.47.101 Escape character is Telnet session 129.150.47.101 b. Log in to the array by typing root and your password at the prompts. The array prompt is displayed. 4. View a listing and the status of the volumes:
<1> vol list
volume capacity raid data standby
vol1 236.058 GB 5 u1d1-8 u1d9
vol2 236.058 GB 5 u2d1-8 u2d9
<2> vol stat
volume capacity raid data standby
vol1 236.058 GB 5 u1d1-8 u1d9
vol2 236.058 GB 5 u2d1-8 u2d9
5. Unmount vol1: <1> vol unmount vol1 6. Remove vol1: <1> vol remove vol1
154. ive Temperature 62 Disk Drive LEDs 63 Repairing Disk Drives 64 Removing and Replacing a Disk Drive 64 Rebuilding a Replaced Drive 68 Upgrading Disk Drive Firmware 71 Interconnect Card Assemblies 75 Interconnect Card LEDs 76 Removing and Replacing an Interconnect Card 77 Upgrading Interconnect Card Firmware 79 Power and Cooling Unit Assemblies 81 Power and Cooling Unit 81 Power and Cooling Unit LEDs 83 Power and Cooling Unit LEDs 83 Removing and Replacing a Power and Cooling Unit 85 UPS Battery 87 Checking the Battery 87 Battery Maintenance 89 Removing and Replacing the UPS Battery 90 Remove the UPS Battery 90 Replace the UPS Battery 94 Contents xix 8 Diagnosing and Correcting FC AL Loop Problems 95 Overview 95 Normal Status 96 The fru stat Command 98 The vol mode Command 99 The port listmap Command 100 The loop stat Command 101 The disk pathstat Command 101 The disk linkstat Command 103 Diagnosing an FC AL Loop 105 FC AL Loop Problem Indicators 106 Checking Performance Against Baseline Data 107 Storage Automated Diagnostic Environment Message Monitoring 108 Manual Examination of the syslog File 108 Example syslog Error Messages 109 Using CLI Diagnostic Commands 110 Using the ofdg Diagnostic Utility 111 The health_check Option 113 The ofdg fast_test Option 113 The ofdg fast_findOption 114 The ofdg findOption 114 Repair Procedures 115 Interconnect Card Replacement Procedure 115 RAID Controller Replacement Procedure 1
155. k ready ready 29 vol3 uld7 ready enabled data disk ready ready 34 vol3 uld8 ready enabled data disk ready ready 37 vol3 uld9 ready enabled data disk ready ready 31 vol3 u2dl ready enabled data disk ready ready 33 vol2 u2d2 ready enabled data disk ready ready 37 vol2 u2d3 ready enabled data disk ready ready 35 vol2 u2d4 ready enabled data disk ready ready 37 vol2 u2d5 ready enabled data disk ready ready 34 vol2 u2d6 ready enabled data disk ready ready 35 vol4 u2d7 ready enabled data disk ready ready 35 vol4 u2d8 ready enabled data disk ready ready 40 vol4 u2d9 ready enabled data disk ready ready 36 vol4 LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP u211 ready enabled master installed 29 0 u212 ready enabled slave installed 30 5 ulll ready enabled master installed 29 5 ull2 ready enabled slave installed 30 0 POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2 ulpcul ready enabled line normal normal normal normal normal ulpcu2 ready enabled line normal normal normal normal normal u2pcul ready enabled line normal normal normal normal normal u2pcu2 ready enabled line normal normal normal normal normal 98 Sun StorEdge T3 Array Field Service Manual November 2002 The vol mode Command The vol mode command returns the current cache mode A cache status other than writebehind might indicate loop problems CODE EXAMPLE 8 1 vol mode Command Normal Ouputs lt 2 gt vol mode volume mounted writebehind mirror voll yes wri
156. ld4 ready enabled data disk ready ready 29 voll uld5 ready enabled data disk ready ready 29 voll uld6 ready enabled data disk ready ready 29 vol3 uld7 ready enabled data disk ready ready 34 vol3 uld8 ready enabled data disk ready ready 37 vol3 uld9 ready enabled data disk ready ready 32 vol3 13 Remove and replace the suspect disk drive from the enclosure See Repairing Disk Drives on page 64 The drive spins up and the sysarea data copies to it from another drive in the ul enclosure After the copy is complete a volume reconstruction starts 14 Rerun the ofdg find diagnostic through the suspect loop as described in Step 10 and Step 11 Once the test completes examine and compare the two outputs to insure that the fault has been corrected a If the problem is resolved proceed with Step 16 through Step 15 122 Sun StorEdge T3 Array Field Service Manual November 2002 b If the problem is not resolved proceed with Step 16 through Step 15 and then replace the backplane chassis See Chassis Replacement Procedure on page 123 and Replacing the Chassis Backplane Assembly on page 126 15 Remount the volumes lt 14 gt vol mount voll lt 15 gt vol stat voll uldl uld2 uld3 uld4 uld5 mounted 0 0 0 0 0 16 Restart the volume reconstruction with the vol recon command on the replaced disk drive lt 16 gt vol recon uld9 17 Reconnect the fibre optic cable to the MIAs 18
157. le 22. Check the file systems on the appropriate LUNs by typing: fsck filesystem 23. Mount the file systems and restart the application by typing: mount filesystem 24. Create a syslog.conf file with the correct remote and local logging entries. Upload it to the array by using ftp and place it in the /etc directory. 25. Restart the Sun StorEdge T3 array syslog daemon: T3 <15> set logto 26. Use the logger command to verify the system is logging properly by typing: T3 <16> logger message where message is the text of a test message to be logged. 27. Create a schd.conf file with the correct BAT_BEG date and 28-day BAT_CYC. Refer to "Note that the next refresh start time is always calculated from the start time of the previous refresh cycle. If a user manually starts a refresh cycle then the next refresh depends on the starting time of the manually activated refresh cycle" on page 89. Make sure to specify a future start date to preclude a refresh during the recovery. 28. Restart the battery scheduler: T3 <16> refresh -i 29. Verify the battery scheduler is working as expected by typing: T3 <17> refresh -s 30. Exit from the serial console session on each controller. CHAPTER 4 Controller Card Assembly
158. le The first message is sent 45 days before the end of life, followed by a warning message every five days thereafter. The power and cooling unit must be replaced within 45 days of receiving the first warning message. The warning message indicates which power and cooling unit needs to be replaced. After the battery service life expires, the cache is forced to write-through mode. Removing and Replacing the UPS Battery. This section covers a cold-swap procedure. Note: Even though the T3 documentation targeted at customers suggests replacing the PCU to replace the battery, trained Sun field personnel who have access to the Sun StorEdge Field Service Manual may follow the procedure shown below to replace only the battery inside the PCU. Remove the UPS Battery. Remove the PCU from the array. See "Removing and Replacing a Power and Cooling Unit" on page 85. Caution: Any power and cooling unit that is removed must be replaced within 30 minutes, or the Sun StorEdge T3 array and all attached arrays automatically shut down and power off. Turn the PCU over so that the bottom of the unit is facing up, as shown in FIGURE 7-4. FIGURE 7-4 Turning the PCU upside down. Remove the four Phillips screws from the panel on the bottom and side of the PCU, as shown in FIGURE 7-5. Use care in removing the screws so they do not fall into the vent holes of the
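The end-of-life warning schedule described at the start of this excerpt (first message 45 days before end of life, then one every five days) can be estimated from the battery install date. The sketch below is a planning aid only, not the array's firmware logic. The function name and parameters are hypothetical, the 730-day life is an assumption standing in for the stated 2-year service life, and the choice to stop generating warnings at the end-of-life date is also an assumption, since the text does not say when the messages stop.

```python
from datetime import date, timedelta
from typing import List


def warning_dates(install_date: date,
                  life_days: int = 730,
                  first_warning_days: int = 45,
                  interval_days: int = 5) -> List[date]:
    """Estimate the dates on which battery end-of-life warnings are logged.

    Assumptions (not confirmed by the manual): service life is counted in
    days from install_date, and warnings stop at the end-of-life date.
    """
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    end_of_life = install_date + timedelta(days=life_days)
    current = end_of_life - timedelta(days=first_warning_days)
    dates = []
    while current <= end_of_life:
        dates.append(current)
        current += timedelta(days=interval_days)
    return dates


schedule = warning_dates(date(2001, 1, 1))
print("first warning:", schedule[0])
print("end of life:  ", schedule[-1])
print("warnings:     ", len(schedule))
```

The first entry in the returned list also marks the start of the 45-day replacement window for the power and cooling unit.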
159. lishing a Serial Port Connection on page 7. Boot the array. There are several ways to initiate a boot cycle: power off the Sun StorEdge T3 array and power it on again; log in to an array and issue a reset command; log in to an array and issue a shutdown command (this requires a power cycle to get the system to start booting); if the array is already running, you can reboot by issuing a boot command with options. Screen messages similar to the following appear: gt tip Connected Copyright All Right Found uni auto boot Detecting Detecting Found 18 Trying to Executing Starting password 9600 dev ttyb T3 EP Release 2 01 2002 3 8 13 05 27 IP Address of tray c 1997 1999 Sun Microsystems Inc s Reserved ts ul ctr1 u2 ctr is enabled hit the RETURN key within 3 seconds to cancel Default master is 1 Default alternate master is 2 Initializing System Drives Initializing XPT Components Initializing QLCF Components Initializing Loop 1 ISP2100 firmware status 3 20 FC AL ports on loop 1 Initializing Loop 2 ISP2100 firmware status 3 20 FC AL ports on loop 2 Initializing SVD Services disks in the system Found 9 disks in Ul Found 9 disks in U2 boot from encid 1 Booting from U1D1 Login root root password Once the array starts a full boot any system problems detected
160. lpcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001784 u2pcul power cooling unit TECTROL CAN 300 1454 01 0000 001544 u2pcu2 power cooling unit TECTROL CAN 300 1454 01 0000 001545 ulmpn mid plane SCI SJ 370 3990 01 0000 000953 u2mpn mid plane SCI SJ 370 3990 01 0000 000958 156 Sun StorEdge T3 Array Field Service Manual November 2002 lt 10 gt fru stat CTLR STATUS STATE ROLE PARTNER TEMP ulctr ready enabled master u2ctr 31 0 u2ctr ready enabled alt master ulctr 30 5 DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME uldl ready enabled data disk ready ready 30 voll uld2 ready enabled data disk ready ready 31 voll uld3 ready enabled data disk ready ready 30 voll uld4 ready enabled data disk ready ready 29 voll uld5 ready enabled data disk ready ready 29 voll uld6 ready enabled data disk ready ready 29 vol3 uld7 ready enabled data disk ready ready 34 vol3 uld8 ready enabled data disk ready ready 37 vol3 uld9 ready enabled data disk ready ready 31 vol3 u2dl ready enabled data disk ready ready 33 vol2 u2d2 ready enabled data disk ready ready 38 vol2 u2d3 ready enabled data disk ready ready 36 vol2 u2d4 ready enabled data disk ready ready 37 vol2 u2d5 ready enabled data disk ready ready 34 vol2 u2d6 ready enabled data disk ready ready 36 vol4 u2d7 ready enabled data disk ready ready 35 vol4 u2d8 ready enabled data disk read
161. lt condition and write behind cache is disabled until the batteries are charged Checking the Battery 1 On the host use the telnet command with the array name or IP address to connect to the array mngt_host telnet array name Trying 123 128 123 015 Connected to 123 123 123 101 Escape character is pSOSystem 123 123 123 101 2 Log in to the array by typing root and the supervisor password at the prompts Chapter 7 Power and Cooling Unit Assemblies 87 88 3 Use the id read command to display battery life related information Unit number n 1 or 2 power cooling unit number n 1 or 2 id read unpcun Revision Manufacture Week Battery Install Week Battery Life Used Battery Life Span Serial Number Vendor ID Model ID Battery Warranty Date Battery Internal Flag 0000 00281999 00412001 275 days 730 days 001787 20011119142702 0x00000000 TECTROL CAN 300 1454 01 50 2 hours 12 hours 4 Use the refresh s command to check the status of a battery refresh cycle The following examples show a battery refresh in progress and a normal battery status no refresh cycle Current Time Sun StorEdge T3 Array Field Service Manual refresh s PCUL PCU2 Ul Completed Recharging Current Time Fri May 26 18 32 07 GMT 2002 Start Time Thu May 25 20 31 19 GMT 2002 Last Refresh Thu May 11 20 22 53 GMT 2002 Next Refresh Thu Jun 08 20 31 19 GMT
162. ltiple drives are combined into a single virtual drive to improve performance and reliability A protocol for remotely managing a computer network A form of dynamic random access memory DRAM that can run at higher clock speeds than conventional DRAM Located on the disk drive label the space that contains configuration data boot firmware and file system information A component within the power and cooling unit The UPS supplies power from a battery in the case of an AC power failure See Interconnect Card Also called a logical unit number LUN a volume is one or more drives that can be grouped into a unit for data storage Glossary 229 W write caching Data used to build up stripes of data eliminating the read modify write overhead Write caching improves performance for applications that are writing to disk 230 Sun StorEdge T3 Array Field Service Manual November 2002 Index SYMBOLS disk linkstat command 103 110 disk pathstat command 101 loop stat command 101 etc ethers file 131 etc hosts file 17 131 etc inetd conf file 16 etc nsswitch conf file 131 etc schd conf file 89 etc syslog conf file 17 usr sbin in rarpd daemon 131 var adm messages file 2 A APATH 103 Assertion reset log type 191 auto boot 9 B back end loop see loop problems batteries 87 etc schd conf file 89 checking 87 id read command 88 maintenance 89 not fully charged 87 refresh cycle 89 r
163. message was received. 0x3 MEDIUM ERROR: This indicates that the command terminated with a non-recovered error condition that was probably caused by a flaw in the medium or an error in the recorded data. This sense key may also be returned if the target is unable to distinguish between a flaw in the medium and a specific hardware failure (sense key 0x4). 0xE MISCOMPARE: This indicates that the source data did not match the data read from the medium. 0x0 NO SENSE: This indicates that there is no specific sense key information to be reported for the designated logical unit. This would be the case for a successful command or a command that received CHECK CONDITION or COMMAND TERMINATED status because one of the file mark, EOM, or ILI bits is set to one. 0x2 NOT READY: This indicates that the logical unit addressed cannot be accessed. Operator intervention may be required to correct this condition. 0x1 RECOVERED ERROR: This indicates that the last command completed successfully with some recovery action performed by the target. Details may be determinable by examining the additional sense bytes and the information field. When multiple recovered errors occur during one command, the choice of which error to report (first, last, most severe, etc.) is device specific. 0xF RESERVED. 0x6 UNIT ATTENTION: This indicates that the removable medium may have been changed or the target h
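When reading raw sense data from syslog, a small lookup of the sense keys listed in this excerpt can speed up triage. The sketch below covers only the keys that appear in the text above; the names `SENSE_KEYS` and `describe_sense_key` are hypothetical. It assumes the value passed in is the byte of fixed-format SCSI sense data whose low four bits hold the sense key and whose upper bits hold the file mark, EOM, and ILI flags mentioned in the NO SENSE description.

```python
# Sense keys named in this excerpt only; other values are reported as unlisted.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x6: "UNIT ATTENTION",
    0xE: "MISCOMPARE",
    0xF: "RESERVED",
}


def describe_sense_key(sense_byte: int) -> str:
    """Return the sense key name for a sense-data byte.

    Only the low four bits are the sense key; the upper bits (file mark,
    EOM, ILI in fixed-format sense data) are masked off before the lookup.
    """
    key = sense_byte & 0x0F
    return f"0x{key:X} {SENSE_KEYS.get(key, 'not listed in this excerpt')}"


for value in (0x03, 0x26, 0x04):
    print(f"byte 0x{value:02X} -> {describe_sense_key(value)}")
```

Masking first matters because a byte such as 0x26 carries a flag bit in addition to sense key 0x6.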
164. n Sie die Batterie nicht ins Feuer Versuchen Sie auf keinen Fall die Batterie auszubauen oder wiederaufzuladen Caution Achtung Systemsteuerungskarten Batterien Panasonic Modells HHR200SCP enth lt eine Nickel Metall Hydridbatterie Werden bei der Behandlung oder beim Austausch der Batterie Fehler gemacht besteht Explosionsgefahr Tauschen Sie Batterien nur gegen Batterien gleichen Typs von Sun Microsystems aus Demontieren Sie die Batterie nicht und versuchen Sie nicht die Batterie auSerhalb des Ger ts zu laden Werfen Sie die Batterie nicht ins Feuer Entsorgen Sie die Batterie ordnungsgem f entsprechend den vor Ort geltenden Vorschriften Caution Achtung Das Netzteil des Geh useabdeckung Caution Achtung Bei Betrieb des Systems ohne obere Abdeckung besteht die Gefahr von Stromschlag und Systemsch den Einhaltung der Richtlinien f r Laser Sun Produkte die mit Laser Technologie arbeiten entsprechen den Anforderungen der Laser Klasse 1 Class 1 Laser Product Luokan 1 Laserlaite Klasse 1 Laser Apparat Laser Klasse 1 Caution Warnung Die Verwendung von anderen Steuerungen und Einstellungen oder die Durchfhrung von Prozeduren die von den hier beschriebenen abweichen knnen gefhrliche Strahlungen zur Folge haben x Sun StorEdge T3 Array Field Service Manual November 2002 Conformit aux normes de s curit Ce texte traite des mesures de s curit qu
165. n as seen from the Alternate Master Controller on page C-200. Interpreting Boot Messages. Boot Message Acronyms. The acronyms used in boot messages are given in TABLE C-6.
TABLE C-6 Boot Message Acronyms
Acronym   Explanation
XPT       Refers to the SCSI transport driver
QLCF      Refers to the QLogic Fibre Channel driver
ISP2x00   The intelligent SCSI processor used in the T3
ECC       The error checking and correcting mechanism used in the Sun StorEdge T3 array controller
XOR       The exclusive OR logic operation used in RAID 5
PGR       The persistent group reservation information that exists when the Sun StorEdge T3 array is attached to a cluster
ALPA      The arbitrated loop physical address assigned to each device on an FC-AL loop
Boot Message Bracket Placement. The synonymous boot message lines shown below give the field service engineer information about how and where they are connected to the Sun StorEdge T3 array: Found units: [u1-ctr] u2-ctr or Found units: u1-ctr [u2-ctr] The position of the brackets indicates which serial port is providing the output. Brackets around u1-ctr indicate that the boot messages are coming from the master controller's serial port. The field service engineer is using tip to connect to that controller's serial port. The same is true with u2-ctr. Detecting FC-AL Ports and Reporting Firmware Status. Compare two sections of boo
166. n the array and prepare the PCU for return to service as described in "Removing and Replacing a Power and Cooling Unit" on page 85. 5. Reset the date by typing bat n u x pcu y from the T3 CLI prompt, where u x is the unit number and pcu y is the location number associated with the PCU that was just installed. This command will zero out the Battery Warranty Date field and set the Battery Install Week field according to the T3 date setting. Additionally, this command will zero out the Battery Internal Flag field if it was set to 1 to indicate low battery. 6. Type id write busage u x pcu y 0 from the T3 CLI prompt. This command calculates the Battery Warranty Date and Battery Life Used. To verify this, you may type id read u x pcu y. CHAPTER 8 Diagnosing and Correcting FC-AL Loop Problems. This chapter describes how to diagnose and correct back-end FC-AL drive loop problems with the array. It contains the following sections: "Overview" on page 95; "Normal Status" on page 96; "Diagnosing an FC-AL Loop" on page 105; "Repair Procedures" on page 115. There are several failure conditions within the back-end loop that do not appear as a failed FRU status. These kinds of failures can only be diagnosed by collecting data from various sources within the system, such as iostat performance data CLI status commands
167. nd Normal Output gatest lt 8 gt loop stat Loop 1 lt 1 gt lt 2 gt Loop 2 lt 1 2 gt gatest lt 9 gt disk pathstat uld1 9 DISK PPATH APATH CPATH PATH POLICY FAIL POLICY uldl 0 U 1 U APATH APATH PATH uld2 0 U lU APATH APATH PATH uld3 0 U LU APATH APATH PATH uld4 0 U Log PPATH PPATH PATH uld5 0 U 1 U PPATH PPATH PATH uld6 O U LU PPATH PPATH PATH uld7 0 U lU PPATH PPATH PATH uld8 0 U lU PPATH PPATH PATH uld9 O U LU PPATH PPATH PATH pass gatest lt 10 gt disk pathstat u2d1 9 DISK PPATH APATH CPATH PATH POLICY FAIL POLICY u2dl Ou LU APATH APATH PATH u2d2 O U 1 U APATH APATH PATH u2d3 0 U 1 U APATH APATH PATH u2d4 O U LU APATH PPATH PATH u2d5 0 U LU APATH PPATH PATH u2d6 0 U 1 U APATH PPATH PATH u2d7 Ou Lo APATH PPATH PATH u2d8 0 U lU APATH PPATH PATH u2d9 0 U de O APATH PPATH PATH pass Where m O U means Loop 1 path_id 0 is Up m 1 U means Loop 2 path_id 1 is Up m 0 D means Loop 1 path_id 0 is Down m 1 D means Loop 2 path id 1 is Down m PPATH means primary path 102 Sun StorEdge T3 Array Field Service Manual November 2002 m APATH means alternate path m CPATH means current path m PATH PO m FAIL PO failover LICY means the preferred path notice the 3 6 s
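The path codes in the legend above pair a path_id (0 or 1) with a state letter (U or D). A minimal decoding sketch follows, assuming the two fields have already been separated from a disk pathstat row; the exact on-screen punctuation is not recoverable from this excerpt, so no line parser is attempted. The function name `decode_path` is hypothetical.

```python
# Mapping taken from the legend: path_id 0 is Loop 1, path_id 1 is Loop 2.
LOOP_BY_PATH_ID = {0: "Loop 1", 1: "Loop 2"}
STATE_BY_LETTER = {"U": "Up", "D": "Down"}


def decode_path(path_id: int, state: str) -> str:
    """Translate a path_id and state letter into the legend's wording."""
    if path_id not in LOOP_BY_PATH_ID:
        raise ValueError(f"unknown path_id: {path_id}")
    letter = state.strip().upper()
    if letter not in STATE_BY_LETTER:
        raise ValueError(f"unknown path state: {state!r}")
    return f"{LOOP_BY_PATH_ID[path_id]} (path_id {path_id}) is {STATE_BY_LETTER[letter]}"


print(decode_path(0, "U"))
print(decode_path(1, "D"))
```

Rejecting unknown letters rather than defaulting keeps OCR-damaged or truncated output from being misread as a healthy path.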
168. net session 129.150.47.101 2. Log in to the array by typing root and the supervisor password at the prompts. 3. On the array, type: <34> vol recon volume name from_standby 4. Start a second Telnet session with the array to check rebuild progress. 5. Check the rebuild progress. Use the information in the PERCENT column and the TIME column, which shows the elapsed time, for estimating when the volume will complete reconstruction.
<35> proc list
VOLUME CMD_REF PERCENT TIME COMMAND
v1 20241 23 0:09 vol recon
Note: If all power is removed from the array while the drive is being reconstructed, the reconstruction process restarts at the beginning when power is restored. 6. Check the drive status to ensure that the reconstruction of the replaced drive FRU has completed. The following example shows a standby drive configured for each volume.
<43> fru stat
CTLR STATUS STATE ROLE PARTNER TEMP
u1ctr ready enabled master u2ctr 30.5
u2ctr ready enabled alt master u1ctr 30.5
DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME
u1d1 ready enabled data disk ready ready 30 vol1
u1d2 ready enabled data disk ready ready 31 vol1
u1d3 ready enabled data disk ready re
169. netd root 140 1 0 Feb 08 0:00 /usr/sbin/inetd -s root 7715 7701 0 11:22:32 pts/18 0:00 grep inetd kill -HUP 140 Setting Up Remote Logging. The Sun StorEdge T3 array can provide remote notification of array events to designated hosts using Simple Network Management Protocol (SNMP) traps. To enable SNMP notification, edit the /etc/syslog.conf and the /etc/hosts files on the array to configure system message logging. Because files cannot be edited on the array, ftp the files to a host to make the edits and then ftp the files back to the array. Refer to the Sun StorEdge T3 Array Administrator's Manual for instructions on setting up remote logging. CHAPTER 3 Diagnosing T3 Array Problems. This chapter provides the qualified service provider with troubleshooting techniques for the Sun StorEdge T3 array and contains the following sections: "Diagnostic Information Sources" on page 19; "Troubleshooting Flow Charts" on page 22; "Initial Troubleshooting Guidelines" on page 25; "Verifying the Data Host Connection" on page 26; "Storage Automated Diagnostic Environment Link Test" on page 27; "Checking Array Boot Status" on page 27; "Telnet Connection Status Checks" on page 30; Testing the Array With Storage Automa
170. ng Drive Status
The following sections describe commands for monitoring the status of the drives. Disk status can be checked by using a variety of CLI commands. This section discusses how to monitor the following:
  Checking Drive Status Codes on page 60
  Checking the Hot Spare on page 61
  Checking Data Parity on page 62
  Checking Drive Temperature on page 62
1. On the host, use the telnet command with the array name or IP address to connect to the array.
  mngt_host telnet array name
  Trying 129.150.47.101
  Connected to 129.150.47.101
  Escape character is
  Telnet session 129.150.47.101
2. Log in to the array by typing root and the supervisor password at the prompts.
Checking Drive Status Codes
Use the vol stat command to check drive status codes. All drives should show a status of 0 under normal conditions.
  <40> vol stat
  (vol stat output: vol1, vol2, vol3, and vol4 mounted; drives u1d1 through u1d9 and u2d1 through u2d9 listed with status 0)
The following table lists numeric drive status codes.
TABLE 5-1 Drive Status Messages
  Value  Description
  0      Drive mounted
  2      Drive present
  3      Drive is spun up
  4      Drive is disabled
  5      Drive has been replaced
  7      Invalid system area on drive
  9      Drive not present
  D      Drive is disabled and is possibly being reconstructed
  S      Driv
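The drive status codes listed in TABLE 5-1 lend themselves to a small lookup when reviewing vol stat results. The following is a sketch only: the function names and the input dictionary are hypothetical, and only the codes whose descriptions are fully legible in the table are included.

```python
# Status codes and descriptions as listed in TABLE 5-1.
DRIVE_STATUS = {
    "0": "Drive mounted",
    "2": "Drive present",
    "3": "Drive is spun up",
    "4": "Drive is disabled",
    "5": "Drive has been replaced",
    "7": "Invalid system area on drive",
    "9": "Drive not present",
    "D": "Drive is disabled and is possibly being reconstructed",
}


def describe_status(code):
    """Return the table description for a status code, or a fallback string."""
    return DRIVE_STATUS.get(str(code).upper(), "Unknown status code")


def drives_needing_attention(statuses):
    """Return {drive: description} for every drive whose status is not 0.

    statuses maps a drive ID such as "u1d4" to its status code as text.
    """
    return {
        drive: describe_status(code)
        for drive, code in statuses.items()
        if str(code) != "0"
    }


if __name__ == "__main__":
    # Hypothetical readings, typed in by hand from a vol stat listing.
    print(drives_needing_attention({"u1d1": "0", "u1d4": "4", "u2d9": "D"}))
```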
171. nnect card 1
(Figure callouts: LED1 and LED2 on interconnect card 1; LED1 and LED2 on interconnect card 2)
FIGURE 6-1 Interconnect Card LEDs
TABLE 6-1 Interconnect Card LED Descriptions
  Interconnect Card Status LED (Green or Amber)   Description
  Off                 Interconnect card not installed, not recognized
  Green, solid        Interconnect card OK. Cable OK (if present).
  Green, slow blink   Interconnect card OK; possible communication problem with other cards. Cable may be bad. OK to replace cable.
  Amber, solid        Interconnect card firmware download in progress
  Amber, slow blink   Interconnect card failure. OK to replace interconnect card.
Note: Even if the LED indicates an interconnect card failure, always verify the FRU status using the CLI before replacing the interconnect card. Refer to Checking FRU Status on page 35 for instructions.
Removing and Replacing an Interconnect Card
Caution: Use the interconnect cables only for cabling Sun StorEdge T3 arrays together using the interconnect card connectors. Do not use these cables for any other FC-AL connection.
Caution: The interconnect card is extremely sensitive to static electricity. Use proper antistatic wrist straps and antistatic procedures when handling any FRU.
Caution: Replace one interconnect c
172. normal normal normal Chapter 3 Diagnosing T3 Array Problems 35 Note The fru stat command reports temperature readings on the interconnect cards controller board disk drives and PCUs For the PCU the fru stat output does not display a numeric temperature but instead reports a temperature state For all other FRUs fru stat reports a numerical temperature System firmware monitors only the temperature state reported by the PCUs This means a high temperature reading on an interconnect card for example will not cause the firmware to take evasive action such as powering off the array Testing the Array With Storage Automated Diagnostic Environment Access the Storage Automated Diagnostic Environment main window and click the Diagnose link Then click the Diagnostics Tests link See the Storage Automated Diagnostic Environment User s Guide for instructions Identifying Miscabled Partner Groups If a partner group has booted successfully but is unable to establish a Telnet connection with the management host the partner group might be cabled together incorrectly The interconnect cable connections between dual controller units are critical for determining which unit is the master controller and which is the alternate master If the interconnect cables are not properly installed on the interconnect cards the top unit could boot as the master controller and the bottom unit would assume alternate master status Because the
173. ntroller Replacement Procedure on page 116
3. Off-Line Drive Diagnostics and Replacement on page 117
4. Chassis Replacement Procedure on page 123
Interconnect Card Replacement Procedure
A single interconnect card can be removed without affecting customer operation, provided that the other card is working. Data accessibility is maintained during the replacement and testing of a single interconnect card, with no change in the host configuration.
For the example of a suspected loop 1, path 0 problem, perform the following steps:
1. From the CLI, disable the u1l1 interconnect card.
  <1> disable u1l1
2. When the u1l1 LED is flashing amber, remove and replace the interconnect card in the u1l1 position. See Removing and Replacing an Interconnect Card on page 77.
3. From the CLI, enable the u1l1 interconnect card.
  <2> enable u1l1
4. Verify the repair by using the listed CLI status commands. See Using CLI Diagnostic Commands on page 110.
If this did not correct the problem, proceed to replacing the RAID controller as described in the next section.
Chapter 8 Diagnosing and Correcting FC-AL Loop Problems 115
RAID Controller Replacement Procedure
If replacing the interconnect cards and cables did not resolve the loop 1, path 0 problem, the next least disruptive repair action is the removal and replacement of a RAID controller. In a partner group a single R
174. nvalid path specified OxF Flush in progress 0x10 Device is not present 0x11 Device is not online 0x12 Command s active 0x13 Failover in progress 0x14 Device is broken 0x15 Device is unavailable 204 Sun StorEdge T3 Array Field Service Manual November 2002 Stripe Type Messages Stripe type messages report the I O operation that was being performed when the stripe RAID stripe message occurred These do not necessarily indicate an I O operation failure These messages are found in the syslog file TABLE C 10 Stripe Type Messages Stripe Type Message Description Control Stripe 0x0100 0101 Dummy 0102 No_Alternate 0103 Alt_Possible 0104 Using_Alternate Header Stripe 0x0200 0201 CCR_Header 0202 Asynch_Header Read Stripe 0x0400 0401 RAID 1_Read 0402 RAID 1_Recon_Read 0403 RAID 1_Read_Check 0404 Data_Source 0405 RAID 5_Small_Read 0406 RAID 5_Recon_Read 0407 RAID 5_Rebuild_Read 0408 RAID 5_Stripe_Read Write Stripe 0x0800 0801 Cache_Write 0802 Cache_Insert Write Disk Stripe 0x0810 0811 RAID 0 Write to Disk 0812 RAID 1 Write to Disk 0813 RAID 0 Insert into disk Block Appendix C Sun StorEdge T3 Array Messages 205 TABLE C 10 Stripe Type Messages Continued Stripe Type Message Description 0814 0815 0816 0817 081 0819 0820 1001 1002 1003 2001 2002 2003 2021 2022 2023 2024 2025 4001 4002 4003 RAID 1 Insert into disk Block RAID 5 RMW to Disk RAID 5 Recon
175. ode has decided which copy of level 2 code to use, it prints 1 or 2 to notify the user which copy is selected. The level 1 code starts at 0xFFF00100 and extends to 0xFFF20000. If there is no valid level 2 code, the code prints 0 after T3B and reboots.
Second Level Boot Code
The second level boot code is comparable to the EPROM mode of the Sun StorEdge T3 array, except that the auto bootmode boots from ROM instead of from disks. The level 2 code allows the user to:
- Set basic system configuration, for example, IP, gateway, and bootmode
- Use tftpboot for the level 3 code
- Use autoboot for the level 3 code
- Update the ROM code when bootmode is tftp
Note: Use the login prompt to set the IP address, netmask, and hostname instead of using the EP prompt. Values for these parameters that are set at the EPROM level will be lost.
There are two copies of level 2 code: one in 0xFFF40000 to 0xFFF9FFFF, the other in 0xFFFA0000 to 0xFFFFFFFF.
To update the EPROM, use the following commands:
- ep download filename
  This updates the level 1 boot code. Because there is only one copy of level 1 code in ROM, if this update fails, the controller may not be able to boot after the failure.
- ep download filename
  This updates the level 2 boot code. Two copies of level 2 boot code are kept in ROM. Level 1 boot code selects the most recently updated one to boot. The
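The two address ranges given for the level 2 boot code can be checked with a little arithmetic. The following is a sketch under the assumption that both end addresses are inclusive; the names are illustrative and do not come from the firmware.

```python
# Address ranges of the two level 2 boot code copies, as quoted in the text.
# Treating each end address as inclusive is an assumption.
LEVEL2_COPIES = {
    1: (0xFFF40000, 0xFFF9FFFF),
    2: (0xFFFA0000, 0xFFFFFFFF),
}


def region_size(start, end):
    """Size in bytes of an address range whose end address is inclusive."""
    return end - start + 1


def regions_are_adjacent(first, second):
    """True if the second range begins immediately after the first one ends."""
    return first[1] + 1 == second[0]


if __name__ == "__main__":
    for copy, (start, end) in LEVEL2_COPIES.items():
        size = region_size(start, end)
        print(f"copy {copy}: {start:#010x}-{end:#010x}, {size // 1024} KiB")
    print(regions_are_adjacent(LEVEL2_COPIES[1], LEVEL2_COPIES[2]))
```

Under that assumption each copy occupies 0x60000 bytes (384 KiB), and the second copy starts directly after the first.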
176. ol4 u2d7 ready enabled data disk ready ready 35 vol4 u2d8 ready enabled data disk ready ready 40 vol4 u2d9 ready enabled data disk ready ready 36 vol4 LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP u211 ready enabled master installed 29 u212 ready enabled slave installed 31 0 ulll ready enabled master installed 29 5 ull2 ready enabled slave installed 30 0 POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2 ulpcul ready enabled line normal normal normal normal normal ulpcu2 ready enabled line normal normal normal normal normal u2pcul ready enabled line normal normal normal normal normal u2pcu2 ready enabled line normal normal normal normal normal Note The batteries in the power and cooling units recharge after powering on the unit During the recharge a fault message is displayed in the fru stat output for the batteries While the batteries are recharging write behind cache is disabled 146 Sun StorEdge T3 Array Field Service Manual November 2002 3 Use the vol add command to create the volume s on the alternate master as follows a Define the volume name vol add volume name b Define the drives data u2dn n on which the volume resides where u2 is the array unit number a dn n are the disk drives n 1 to 9 c Define the RAID level raid n where n 0 1 or 5 d Optional define the hot spare drive standby und9 where a u2 is the array unit number m d9 is the number of the hot spare disk d
177. ompleted Chapter 8 Diagnosing and Correcting FC AL Loop Problems 119 In Syslog 518 133 18 3 1833 18 318 218 218 x18 218 18 18 18 318 18 18 18 18 s 1873 18 218 18 218 ARES 219 210 219 Lo SELO sdo sl SAS 119 SES RE 19 10 320 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 ay 26 amp O AO vv O O 9O O O O MO IO IO O 9o O 9 IO 9 9o IO IO IO O IO IO IO IO O O O IO IO O O OO 03 18 22 22 22 23 28 38 38 38 38 38 38 41 50 LF 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 50 50 50 50 50 50 50 50 11 31 32 32 32 32 32 32 32 32 32 33 41 43 04 ISR1 SVDT 1 ISR1 SVDT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 LPCT 1 Saag 22 o o fdg find ullpl fdg find ulll ONDG Initiated G hp ce oe cc oe So co sc co co So co co co oo oe cs co co coca 1ctr ISP2200 2 Received LIP f7 e8 async event lctr Port event received on port 0 abort 0 111 ONDG Loop Fault Diag Initiated lctr Reserved A Loop A Mask lt 1 gt B Mask lt 1 gt 1d4 SVD_PATH_FAILOVER path_id ld5 SVD_PATH_FAILOVER p
178. ons as described in Static Electricity Precautions on page 5.
a. Unlock a FRU by pushing in on the latch handle(s) with a coin or small screwdriver to release the latch handle(s).
b. Pull the FRU straight out.
c. Lock the FRU back into place in the new chassis by pushing in and securing the latch handle(s) with a coin or small screwdriver.
Caution: Maintain disk positions or data could be lost.
d. Remove and replace the controller card. See Removing and Replacing a Controller Card on page 49 for instructions.
e. Remove and replace the interconnect cards. See Removing and Replacing an Interconnect Card on page 77 for instructions.
f. Remove and replace the power and cooling units. See Removing and Replacing a Power and Cooling Unit on page 85 for instructions.
g. Remove and replace the disk drives. See Repairing Disk Drives on page 64 for instructions.
Note: When removing disk drives, label each one with its slot position in the unit so that you can replace the drives in the correct slots.
Move the replacement chassis back into place.
- If you are mounting the chassis in a cabinet:
Chapter 9 Chassis Backplane Assembly 129
- Prepare for the new chassis by installing the base plate. Use the base plate from the old chassis.
- Align the new chassis with the side rails and slide the chassis into the cabinet.
- Replace the two screws at the back of the chassis to secure the chassis to the
179. ons marked on the equipment.
Ensure that the voltage and frequency of your power source match the voltage and frequency inscribed on the equipment's electrical rating label.
Never push objects of any kind through openings in the equipment. Dangerous voltages may be present. Conductive foreign objects could produce a short circuit that could cause fire, electric shock, or damage to your equipment.
Symbols
The following symbols may appear in this book:
Caution: There is risk of personal injury and equipment damage. Follow the instructions.
Caution: Hot surface. Avoid contact. Surfaces are hot and may cause personal injury if touched.
Caution: Hazardous voltages are present. To reduce the risk of electric shock and danger to personal health, follow the instructions.
Off: Removes AC power from the system.
Standby: The On/Standby switch is in the standby position.
Modifications to Equipment
Do not make mechanical or electrical modifications to the equipment. Sun Microsystems is not responsible for regulatory compliance of a modified Sun product.
Placement of a Sun Product
Caution: Do not block or cover the openings of your Sun product. Never place a Sun product near a radiator or heat register. Failure to follow these guidelines can cause overheating and affect the reliability of your Sun product.
180. or disk drive and the FRU number n TABLE D 2 lists the possible FRU variables as they appear in this appendix TABLE D 2 FRU Identifiers FRU Identifier Controller card uencidctr Power and cooling unit uencidpcu 1 2 Interconnect card uencidl 1 2 Disk drive uencidan Appendix D Unit number encid unit number 1 2 encid unit number 1 2 n pcu number 1 2 encid unit number 1 2 n interconnect card number 1 2 encid unit number 1 2 n disk drive number 1 to 9 Sun StorEdge T3 Array System Commands 217 218 Sun StorEdge T3 Array Field Service Manual November 2002 APPENDIX E FC AL Loop Identifiers This Appendix lists the FC AL loop identifies by AL_PA hex Switch hex and Setting decimal values The values are listed from lowest to highest priority The AL_PA value of 00 is reserved for an FL_PORT The value is not available TABLEE 1 Assigned Loop Identifier Switch Setting Switch Setting Switch Setting hex dec hex dec hex dec 00 2 2C 44 58 88 01 2D 45 59 89 2E 46 5A 90 2F 47 5B 91 02 03 04 30 48 5C 92 05 31 49 5D 93 32 50 5E 94 33 51 5F 95 06 07 08 34 52 60 96 2 3 4 5 6 7 8 9 09 35 53 61 97 36 54 62 98 37 55 63 99 0A 0B e cc OC 38 56 64 N 0D 39 57 65 m w 3A 58 66 3B 59 67 OE OF oR OO BR 219 TABLEE 1 Assigned Loop Identifier Switch Setting AL_PA
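The FRU identifier forms in TABLE D-2 (controller card, power and cooling unit, interconnect card, disk drive) follow a regular pattern, so they can be validated mechanically. The following is a sketch, not a tool shipped with the array: the function name and the returned tuple layout are hypothetical, and the allowed ranges (unit 1 or 2, PCU and interconnect card 1 or 2, disk 1 to 9) are taken from the table as printed.

```python
import re

# u<encid>ctr | u<encid>pcu[1|2] | u<encid>l[1|2] | u<encid>d<n>
_FRU_RE = re.compile(
    r"^u(?P<encid>[12])"
    r"(?:(?P<ctr>ctr)|pcu(?P<pcu>[12])|l(?P<loop>[12])|d(?P<disk>[1-9]))$"
)


def parse_fru_id(fru_id):
    """Split a FRU identifier into (FRU type, unit number, FRU number).

    The FRU number is None for a controller card, which has no index.
    Raises ValueError if the identifier does not match a TABLE D-2 form.
    """
    match = _FRU_RE.match(fru_id)
    if match is None:
        raise ValueError(f"not a recognized FRU identifier: {fru_id!r}")
    unit = int(match.group("encid"))
    if match.group("ctr"):
        return ("controller card", unit, None)
    if match.group("pcu"):
        return ("power and cooling unit", unit, int(match.group("pcu")))
    if match.group("loop"):
        return ("interconnect card", unit, int(match.group("loop")))
    return ("disk drive", unit, int(match.group("disk")))


if __name__ == "__main__":
    for example in ("u1ctr", "u2pcu1", "u1l2", "u2d9"):
        print(example, parse_fru_id(example))
```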
181. or the Sun StorEdge T3 array and all attached arrays will automatically shut down and power off 6 Insert the new interconnect card making sure that the card sits on the frame 7 Lock the new interconnect card in place by pushing in the latch handle Use a coin or small screwdriver to press in and secure the latch handle 8 Reconnect the interconnect cable to the interconnect card 9 Verify that the LEDs on the interconnect card show that the interconnect card has initialized properly 10 Verify the status of the interconnect card using the CLI Refer to Checking FRU Status on page 35 for instructions 78 Sun StorEdge T3 Array Field Service Manual November 2002 11 Type lpc version to view and verify the firmware level of the new interconnect card See Upgrading Interconnect Card Firmware on page 79 for instructions if necessary Upgrading Interconnect Card Firmware The interconnect card firmware is stored in the FLASH memory device on the interconnect card The array can be operational during the interconnect card firmware upgrade The firmware upgrade procedures that follow must be done through the Ethernet connection The latest firmware versions are located on the SunSolve web site http sunsolve sun com m The firmware must be resident on the host for this operation m The Sun StorEdge T3 arrays must have a supervisor password prior to attempting this procedure To upgrade the firmware see
182. orted Notes Ox5F PERSISTENT RESERVE OUT ye 0xD0 LUN FAILOVER yes Vendor specific 0x3C READ BUFFER yes Available in 1 18 2 0 1 0x3B WRITE BUFFER yes Available in 1 18 2 0 1 0x4D LOG SENSE no 0x4C LOG SELECT no 208 Sun StorEdge T3 Array Field Service Manual November 2002 Arbitrated Loop Physical Addresses AL_PA and Loop IDs TABLE C 12 Arbitrated Loop Physical Addresses and Loop IDs AL_PA SEL_ID Target AL_PA SEL_ID Target AL_PA SEL_ID Target hex hex dec hex hex dec hex hex dec EF 00 0 A3 2B 43 4D 56 86 E8 01 1 9F 2C 44 4C 57 87 E4 02 2 9E 2D 45 4B 58 88 E2 03 3 9D 2E 46 4A 59 89 E1 04 4 9B 2F 47 49 5A 90 EO 05 5 98 30 48 47 5B 91 DC 06 6 97 31 49 46 5C 92 DA 07 7 90 32 50 45 5D 93 D9 08 8 8F 33 51 43 5E 94 D6 09 9 88 34 52 3C 5F 95 D5 0A 10 84 35 53 3A 60 96 D4 0B 11 82 36 54 39 61 97 D3 0c 12 81 37 55 36 62 98 D2 0D 13 80 38 56 35 63 99 D1 OE 14 7C 39 57 34 64 100 CE OF 15 7A 3A 58 33 65 101 CD 10 16 79 3B 59 32 66 102 CC 11 17 76 3C 60 31 67 103 CB 12 18 75 3D 61 2E 68 104 CA 13 19 74 3E 62 2D 69 105 C9 14 20 73 3F 63 2C 6A 106 C7 15 21 72 40 64 2B 6B 107 C6 16 22 71 41 65 2A 6C 108 C5 17 23 6E 42 66 29 6D 109 Appendix C Sun StorEdge T3 Array Messages 209 TABLE C 12 Arbitrated Loop Physical Addresses and Loop IDs Continued AL_PA SEL_ID Target AL_PA SEL_ID Target AL_PA SEL_ID Target hex hex dec hex hex dec hex hex dec C3 18 24 6D 43 67 27 6E 110 BC 19 25 6C 44 68 26 6F 111 BA 1A
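The relationship between SEL_ID (loop ID) and AL_PA in TABLE C-12 is a fixed lookup rather than a formula. The following sketch transcribes only the first sixteen rows that are legible in the table (SEL_ID 0x00 through 0x0F); the dictionary and function names are hypothetical, and the remaining rows would need to be added from the full table before real use.

```python
# First sixteen SEL_ID -> AL_PA rows, transcribed from TABLE C-12.
SEL_ID_TO_AL_PA = {
    0x00: 0xEF, 0x01: 0xE8, 0x02: 0xE4, 0x03: 0xE2,
    0x04: 0xE1, 0x05: 0xE0, 0x06: 0xDC, 0x07: 0xDA,
    0x08: 0xD9, 0x09: 0xD6, 0x0A: 0xD5, 0x0B: 0xD4,
    0x0C: 0xD3, 0x0D: 0xD2, 0x0E: 0xD1, 0x0F: 0xCE,
}

# Reverse mapping, built from the same rows.
AL_PA_TO_SEL_ID = {al_pa: sel_id for sel_id, al_pa in SEL_ID_TO_AL_PA.items()}


def al_pa_for(sel_id):
    """Return the AL_PA for a SEL_ID, or raise LookupError if not transcribed."""
    try:
        return SEL_ID_TO_AL_PA[sel_id]
    except KeyError:
        raise LookupError(f"SEL_ID {sel_id:#04x} is not in the partial table") from None


def sel_id_for(al_pa):
    """Return the SEL_ID for an AL_PA, or raise LookupError if not transcribed."""
    try:
        return AL_PA_TO_SEL_ID[al_pa]
    except KeyError:
        raise LookupError(f"AL_PA {al_pa:#04x} is not in the partial table") from None


if __name__ == "__main__":
    print(f"{al_pa_for(0x0C):#04x}")
```

Because AL_PA values decrease as SEL_ID increases, keeping the table explicit avoids any temptation to compute one from the other.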
183. ory is 9 drive RAID 5 with no standby disk The volume is configured as follows m vol add v0 data u1d1 9 RAID 5 m vol init v0 sysarea m vol init vO data rate 16 Appendix B Sun StorEdge T3 Array System Defaults 171 Default Directories and Files TABLE B 4 lists the default file system shipped with the array TABLE B 4 Default Directories and Files Filename nb113 bin lplc_05 01 BITMAP SYS ep2_10 bin FLIST SYS cmdlog adm webgui etc hosts etc schd conf etc syslog conf syslog web snmp T3 mib Description Controller firmware RR Sum 23020 5000 Interconnect card FW RR Sum 63295 21 Contains a map of used and free blocks Controller EPROM flash RR Sum 3221 1023 Contains the file descriptors Log of all commands executed on the system Legacy directory formerly used for syslog files Contains old browser based admin files Default hosts with comments on format of file Battery refresh file Contents BAT _CYC 14 System logging configuration file Default system logging file SNMP required file Note At time of manufacture Sun StorEdge T3 array system disks do not contain controller firmware interconnect card binaries EP binaries or drive firmware images You can download all of these from the SunSolve web site 172 Sun StorEdge T3 Array Field Service Manual November 2002 APPENDIX C Sun StorEdge T3 Array Messages This appendix contains a description of array error mess
184. oup of up to 16 LUNs A host generated error message could indicate that the host cannot communicate with the array through the Fibre Channel Arbitrated Loop FC AL channel or that an excessive number of channel errors are occurring If the host loses access to the array through the channel connection then any host messages regarding the array will refer only to the LUNs In a partner group configuration where multi pathing failover has been established the failure of a channel path or array controller causes the host to redirect I O from the failed channel to the second FC AL connection A variety of software logging tools monitor the various branches of the storage network When an error is detected the error s severity level is categorized and classified Errors are reported or logged according to severity level TABLE 1 1 TABLE 1 1 Levels of Message Notification Message Level Description Error Indicates a critical system or storage network event or failure requiring immediate intervention or attention Warning Indicates a possible system or storage network event or failure requiring eventual intervention Notice Indicates a system event that could be a normal periodic notification a system fault operator keyboard commands or a result of other events Information Indicates a system event that has no impact upon the system or storage networks ability to perform tasks The syntax of the error message uses a field replace
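The four message levels in TABLE 1-1 form an ordered scale, which is what allows errors to be reported or logged according to severity. The following is an illustrative sketch only: the numeric ranks are an assumption made for ordering and are not the values used by any Sun logging tool, and the sample messages are invented.

```python
from enum import IntEnum


class MessageLevel(IntEnum):
    """Message levels from TABLE 1-1; the numeric ranks are assumed."""
    INFORMATION = 1
    NOTICE = 2
    WARNING = 3
    ERROR = 4


def at_or_above(messages, threshold):
    """Keep only the (level, text) pairs at or above the threshold level."""
    return [(level, text) for level, text in messages if level >= threshold]


if __name__ == "__main__":
    # Invented sample messages, for illustration only.
    sample = [
        (MessageLevel.INFORMATION, "FRU state is good"),
        (MessageLevel.WARNING, "FRU disabled, recovery started"),
        (MessageLevel.ERROR, "over temperature condition"),
    ]
    for level, text in at_or_above(sample, MessageLevel.WARNING):
        print(level.name, text)
```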
185. p Coordinate and schedule this down time with the customer.
To view the available ofdg utility command parameters, enter ofdg on the command line with no options.
  <15> ofdg
  usage: ofdg [-y] health_check
         ofdg [-y] fast_test u<encid>l[1|2]
         ofdg [-y] fast_find u<encid>l[1|2]
         ofdg [-y] find u<encid>l[1|2]
The ofdg parameters are:
- health_check does a fast Go/No-Go test of both loops using the current loop configuration. health_check uses fast_test, but no other parameters are required. See The health_check Option on page 113 for additional details.
- fast_test does a fast Go/No-Go test of the selected enclosure and loop with the current loop configuration. See The ofdg fast_test Option on page 113 for additional details.
- fast_find does a fast Go/No-Go test of the selected enclosure and loop. It also runs a simplified loop fault isolation diagnostic. See The ofdg fast_find Option on page 114 for additional details.
- find does an extensive Go/No-Go test. If loop failures are detected, it automatically initiates the full loop fault isolation diagnostic. This is similar to ondg find. See The ofdg find Option on page 114 for additional details.
See Off-Line Drive Diagnostics and Replacement on page 117 for a step-by-step description of using this utility to diagnose and replace a bad drive.
Caution: There are limitations to using the ofdg utility. Make
186. pes on page 174 FRU Identifiers on page 175 The following sub sections describe these components and list possible error and warning messages See the Sun StorEdge T3 Array Administrator s Manual for explanations of the more important error messages Message Types A syslog daemon exists in the hardware RAID controller that records system messages and provides for remote monitoring There are four levels of messages listed in TABLE C 1 in order of severity Refer to the Sun StorEdge T3 Array Administrator s Manual to use the set command to set the loglevel to receive notification of the various types of messages TABLEC 1 Message Types Message Type Definition Error Indicates a critical system event requiring immediate user intervention or attention For example an over temperature condition or a detected FRU being removed Warning Indicates a possible event requiring eventual user intervention For example a FRU being disabled and recovery procedure executed Notice Indicates a system event that might be a side effect of other events or may be a normal condition For example the power switch is turned off Information Indicates a system event that has no consequence on the running health of the system For example a good state of a FRU 174 Sun StorEdge T3 Array Field Service Manual November 2002 FRU Identifiers The syntax of the error message uses a FRU identifier to refer to a particular FRU in a d
187. plit LICY is not supported always PATH for path failover vs NONE for no The disk linkstat Command The disk linkstat command returns whether a device port link status register can be accessed by a controller in its current configuration If the link status register cannot be ac Note The controller A cessed this may indicate a path problem to those disk s ports Telnet session will always run the command through the master Ithough it is possible to connect directly to the alternate controller it is not supported CODE EXAMPLE 8 5 disk linkstat Command Normal Ouput lt 9 gt disk linkstat uld1 9 path 0 uldl 2 uld2 2 uld3 2 uld4 2 uld5 2 uld6 2 uld7 2 uld8 2 uld9 2 DISK LINKFAIL LOSSSYNC LOSSSIG PROTOERR INVTXWORD INVCRC 16 0 0 51 0 67 0 0 48 0 15 0 0 41 0 56 0 0 58 1 40 0 0 50 0 90 0 0 39 0 28 0 0 51 1 20 0 0 64 1 20 0 0 87 0 Chapter 8 Diagnosing and Correcting FC AL Loop Problems 103 104 The status for the command example shown below is correct for a split loop configuration CODE EXAMPLE 8 6 disk linkstat Command Split Loop Ouput From U1 Controller lt 24 gt disk linkstat uld1 9 path 0 DISK LINKFAIL LOSSSYNC LOSSSIG PROTO ERR INVTXWORD I VCRC ldl 1d2 1d3 1d4 1d5 1d6 1d7 1d8 1d9 CG EUR GE Grip ong T 2o 0 lt o co o o pass lt 25 gt OO OOO ONS O Oi Sii 0 0 o OOo NOP Oo 0
188. ppendix C Sun StorEdge T3 Array Messages 187
The above message is printed when the host port task (FCC0) receives an abort command from the initiator. The initiator sends the abort when it detects an error on the target, which in this case is the Sun StorEdge T3 array LUN being accessed on port 0 (see Identifying Sun StorEdge T3 Array Ports and Loops on page 181). Check the host syslog. You should see SCSI resets and retries that occurred at the same time.
Assertion and Exception Reset Messages
These occur for one of two reasons: a hardware fault generates an exception, or a controller reaches an area of code designed to generate an assertion in certain scenarios. They are somewhat analogous to a kernel panic in Solaris. When an event occurs or a situation arises that could result in writing or reading bad data, the controller panics.
In a properly configured and healthy enterprise configuration, this is not a problem. The other controller takes over the LUNs and disables the controller that experienced the event. In a workgroup configuration, the controller resets and you lose access to your LUNs, which is a good argument for host-based mirroring.
Note: Important information regarding where the reset is recorded. Each controller has a space in NVRAM where the last reset is stored. When another exception or assertion reset occurs, the new information replaces the information in NVRAM. This information also follows the controller when i
189. r each Sun StorEdge T3 array TABLEF 1 Sun StorEdge T3 array Information Worksheet Management Host Application Host TFTP Host Host ID Host name Host IP address Gateway IP address Sun StorEdge T3 IP address Sun StorEdge T3 array name OS patch revision level VERITAS DMP release Primary application Sun Storage Automated Diagnostic Environment release 222 Sun StorEdge T3 Array Field Service Manual November 2002 TABLE F 1 Sun StorEdge T3 array Information Worksheet Continued Management Host Application Host TFTP Host Legend Required Field Optional Field Not Applicable Appendix F Sun StorEdge T3 Array Configuration Worksheets 223 224 Sun StorEdge T3 Array Field Service Manual November 2002 Glossary A administrative domain alternate master unit alternate pathing AP auto cache mode auto disable auto reconstruction B buffering Partner groups interconnected controller units that share common administration through a master controller The secondary array unit in a partner group that provides failover capability from the master unit A mechanism that reroutes data to the other array controller in a partner group upon failure in the host data path Alternate pathing requires special software to perform this function The default cache mode for the Sun StorEdg
190. r firmware For example lt 5 gt ver T3B Release 2 01 01 2002 07 30 19 16 42 10 4 35 134 Copyright C 1997 2001 Sun Microsystems Inc All Rights Reserved The ver command displays the header information Chapter 3 Diagnosing T3 Array Problems 33 4 Enter fru list to display the firmware for the disk drives interconnect card and EPROM level In the event of a FRU failure fru list output contains the serial numbers helpful in verifying correct FRU replacement lt 7 gt fru list ID TYPE VENDOR MODEL REVISION SERIAL ulctr controller card 0301 501 5710 02 020100 020101 112035 u2ctr controller card 0301 501 5710 02 020100 020101 112122 uldl disk drive SEAGATE ST336704FSUN A726 3CD1HMKJ uld2 disk drive SEAGATE ST336704FSUN A726 3CD1HH2A uld3 disk drive SEAGATE ST336704FSUN A726 3CD1H9WS uld4 disk drive SEAGATE ST336704FSUN A726 3CD1HM64 uld5 disk drive SEAGATE ST336704FSUN A726 3CD1HMC2 uld6 disk drive SEAGATE ST336704FSUN A726 3CD1HM63 uld7 disk drive SEAGATE ST336704FSUN A726 3CD1HE3A uld8 disk drive SEAGATE ST336704FSUN A726 3CD1HNKO uld9 disk drive SEAGATE ST336704FSUN A726 3CD1HM5P u2dl disk drive SEAGATE ST336704FSUN A726 3CD1HHH5 u2d2 disk drive SEAGATE ST336704FSUN A726 3CD1HMJC u2d3 disk drive SEAGATE ST336704FSU
191. restored to both units successfully define and mount the volume s on the alternate master Note Make sure that both units are online and that all LEDs are green It can take several minutes after powering on for the units to be ready 1 Start a Telnet session with the master controller unit a On the host use the telnet command with the array name or IP address to connect to the master unit telnet disk_tray_name Trying 129 150 47 101 Connected to 129 150 47 101 Pail Escape character is Telnet session 129 150 47 101 Note The Telnet session verifies that your network connection is good If you cannot connect through the Telnet session you might have miscabled the partner group See Identifying Miscabled Partner Groups on page 36 to determine if this is the problem If the partner group is cabled correctly then the IP address might not be assigned correctly If you suspect this as the problem verify the IP address in a serial cable connection and verify that the RARP server is functional b Log in to the array by typing root and your password at the prompts The array prompt is displayed 144 Sun StorEdge T3 Array Field Service Manual November 2002 2 Check the FRU status using the fru list and fru stat commands Make sure that all FRUs are displayed and that FRU conditions are good as shown in the following examples
192. rive
  <3> vol add volume name data undn-n raid n standby und9
For example:
  <4> vol add vol2 data u2d1-8 raid 5 standby u2d9
- vol2 is the volume name.
- u2d1-8 indicates the location of the volume: unit 2, drives 1 through 8.
- raid 5 is RAID level 5.
- standby u2d9 is the location of the hot spare: unit 2, drive 9.
4. Check the status of the volumes to ensure that you created the volume correctly.
  <1> vol stat
  <1> vol list
The status of all drives must be 0. For example:
  u2d1  u2d2  u2d3  u2d4  u2d5  u2d6  u2d7  u2d8  u2d9
  0     0     0     0     0     0     0     0     0
  capacity    raid  data    standby
  236.058 GB  5     u1d1-8  u1d9
  236.058 GB  5     u2d1-8  u2d9
Chapter 10 Hardware Reconfiguration 147
5. Initialize the volumes.
  <3> vol init vol1 data
  <3> vol init vol2 data
6. Mount the volumes.
  <3> vol init
  <3> vol init
7. Use the format command on a Solaris host to find out information about the new volume. The format command probes for new devices and provides information about them, including their sizes and pathnames. Refer to the format man page for more information on this command.
Disconnecting a Partner Group to Form Single Controller Units
Caution: Back up all data before beginning this procedure.
This section describes how to reconfigure a partner group to form two ex
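The vol add syntax shown earlier in this item (volume name, data drive range, RAID level, optional hot spare on drive 9) can be assembled and sanity-checked before it is typed at the array prompt. The following is a minimal sketch: the helper name and its parameters are hypothetical, the hyphenated drive range is assumed from the example, and the checks reflect only the limits stated in the procedure (RAID level 0, 1, or 5; drives 1 to 9; hot spare on drive 9). The array applies further rules of its own that this does not model.

```python
def build_vol_add(name, unit, first_drive, last_drive, raid_level, standby=False):
    """Compose a vol add command line as a string.

    unit: array unit number (1 or 2).
    first_drive, last_drive: data drive range within the unit (1 to 9).
    standby: if True, name drive 9 of the same unit as the hot spare.
    """
    if unit not in (1, 2):
        raise ValueError("unit must be 1 or 2")
    if raid_level not in (0, 1, 5):
        raise ValueError("raid_level must be 0, 1, or 5")
    if not 1 <= first_drive <= last_drive <= 9:
        raise ValueError("drive range must lie within 1 to 9")
    if standby and last_drive == 9:
        raise ValueError("drive 9 cannot be both a data drive and the hot spare")
    parts = [
        "vol", "add", name,
        "data", f"u{unit}d{first_drive}-{last_drive}",
        "raid", str(raid_level),
    ]
    if standby:
        parts += ["standby", f"u{unit}d9"]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_vol_add("vol2", 2, 1, 8, 5, standby=True))
```

Called with the values from the example, the helper produces the same command text as the vol2 line in the procedure.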
193. rmine if this is true, enter fru myuid and sys stat.
  fru myuid
  u1
If you are connected to the alternate, stop the tip session, reconnect the serial cable to the master unit, and start the session again. Verify that the role of the unit to which you are connected is specified as Master.
Telnet Connection Status Checks
Check array status using a variety of CLI commands. This section contains the following topics:
- Determining Failover on page 30
- Verifying the Firmware Level and Configuration on page 32
- Checking FRU Status on page 35
Determining Failover
1. On the host, use the telnet command with the array name or IP address to connect to the array.
  mngt_host telnet disk tray name
  Trying 172.20.57.30
  Connected to auggie.Central.Sun.COM
  Escape character is
  Telnet session 172.20.57.30
2. Log in to the array by typing root and the supervisor password at the prompts.
3. To determine which unit is the master or alternate master unit, enter sys stat. The following example shows a partner group in a normal state.
  <2> sys stat
  Unit  State   Role    Partner
  1     ONLINE  Master  2
  2     ONLINE  AlterM  1
In a failover state, unit 2 assumes the role of master controller and unit 1 is disabled, as shown in the following example.
  <3> sys
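The sys stat check in step 3 can also be applied to captured session text. The following is a sketch that assumes the four-column layout of the normal-state example (Unit, State, Role, Partner); the function names are hypothetical, and because the failover-state listing is not reproduced here, a row with a missing partner column is simply tolerated rather than interpreted.

```python
def parse_sys_stat(text):
    """Parse sys stat rows into dictionaries; header and rule lines are skipped."""
    units = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with the unit number; a partner column may be absent.
        if len(fields) not in (3, 4) or not fields[0].isdigit():
            continue
        partner = int(fields[3]) if len(fields) == 4 and fields[3].isdigit() else None
        units.append({
            "unit": int(fields[0]),
            "state": fields[1],
            "role": fields[2],
            "partner": partner,
        })
    return units


def master_unit(units):
    """Return the number of the unit that is ONLINE with role Master, or None."""
    for entry in units:
        if entry["state"] == "ONLINE" and entry["role"] == "Master":
            return entry["unit"]
    return None


if __name__ == "__main__":
    captured = """Unit  State   Role    Partner
1     ONLINE  Master  2
2     ONLINE  AlterM  1
"""
    print(master_unit(parse_sys_stat(captured)))
```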
194. rred Notification of an event m 0x0 Pathevent detected on m Oxc Disk identified as having the error SEL_ID column of the internal AL_PA chart or lt 1 gt sim f num 0 id2alpa Oxc pass gt loopid alpa gt 0xc 0xd3 m 0x0 LUN which disk is a part of ISR1 2 N u2d1 SCSI Disk Error Occurred path 0x1 Appendix C Sun StorEdge T3 Array Messages 185 where m u2d1 Path error detected on m SCSI Disk Error Occurred Notification of an event m 0x1 Disk where error is occurring RAID Stripe ISR1 2 N u2d8 sid 234096 stype 2023 disk error 3 where m sid 234096 RAID stripe in cache m stype 2023 RAID stripe type see table m error 3 Specific error type see table SCSI Disk Errors These events are recorded by a sequence of 4 messages describing the disk having the error the path the error is detected on the actual error a translation and the Valid Information field The 1st and 3rd lines are the most important since they tell us which disk had the error and what that error was 1 09 58 43 ISR1 1 N u1d3 SCSI Disk Error Occurred path 0x1 2 09 58 43 ISR1 1 N Sense Key 0x1 Asc 0x17 Ascq 0x1 3 09 58 43 ISR1 1 N Sense Data Description Recovered Data With Retries 4 09 58 43 ISR1 1 N Valid Information 0x26af795 Line 1 Tells us an error occurred and on what disk Line 2 A detailed description of the error reported See
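The four-message SCSI disk error sequence described above (disk and path, sense key, sense data description, valid information) can be grouped into one record per event. The following is a minimal Python sketch, not a Sun tool: the function name summarize_disk_errors is hypothetical, and the regular expressions assume message shapes like those in the example, with tolerance for punctuation that may differ in a real syslog.

```python
import re

# Assumed message shapes, modelled on the four-line example in the excerpt.
# \W+ tolerates separators such as " = ", ", " or " (" that may or may not
# be present in an actual syslog file.
HEX = r"(0x[0-9a-fA-F]+)"
DISK_RE = re.compile(r"(u\d+d\d+)\s+SCSI Disk Error Occurred\W+path\W+" + HEX)
SENSE_RE = re.compile(r"Sense Key\W+" + HEX + r"\W+Asc\W+" + HEX + r"\W+Ascq\W+" + HEX)
DESC_RE = re.compile(r"Sense Data Description\W+(.+)$")


def summarize_disk_errors(lines):
    """Group each 'SCSI Disk Error Occurred' line with the detail lines after it."""
    events = []
    current = None
    for line in lines:
        match = DISK_RE.search(line)
        if match:
            # First message of a sequence: which disk, and on which path.
            current = {"disk": match.group(1), "path": match.group(2)}
            events.append(current)
            continue
        if current is None:
            continue  # detail line with no preceding disk line; ignore it
        match = SENSE_RE.search(line)
        if match:
            current["sense_key"], current["asc"], current["ascq"] = match.groups()
            continue
        match = DESC_RE.search(line)
        if match:
            # Third message of a sequence: the translated error description.
            current["description"] = match.group(1).strip()
    return events


if __name__ == "__main__":
    sample = [
        "09:58:43 ISR1[1]: N: u1d3 SCSI Disk Error Occurred (path = 0x1)",
        "09:58:43 ISR1[1]: N: Sense Key = 0x1, Asc = 0x17, Ascq = 0x1",
        "09:58:43 ISR1[1]: N: Sense Data Description = Recovered Data With Retries",
        "09:58:43 ISR1[1]: N: Valid Information = 0x26af795",
    ]
    for event in summarize_disk_errors(sample):
        print(event)
```

The sketch keeps only the first and third messages' content plus the sense codes, since those identify which disk had the error and what the error was.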
195. s on the array type help at the prompt lt 1 gt help arp help tail boot more sync refresh cat cd cmp cp date echo head LS mkdir mv ping pwd rm rmdir touch disable disk enable fru id logger lpc ntp passwd port proc reset set shutdown sys tzset ver vol volslice ep route ofdg lun hwwn 10 Sun StorEdge T3 Array Field Service Manual November 2002 For more information on how to set up the syslog file and interpret it refer to the Sun StorEdge T3 Array Administrator s Manual for instructions on setting up remote logging For information on how to use the CLI commands see Sun StorEdge T3 Array Administrator s Manual Chapter 2 Connecting to the Sun StorEdge T3 Array 11 Establishing an FTP Session To establish an FTP session 1 Start an FTP session from the management host to the array For example mgmt host lt 15 gt ftp 123 123 123 2 Connected to 123 123 123 2 Escape character is Telnet session 123 123 123 2 NUPPC 2 0 0 G ready Name 123 123 123 2 root 2 Log in to the array by typing root Name 123 123 123 2 root root 331 Password required for root Password password 230 User root logged in ftp gt where password is the root password Note Be sure to set the Binary mode if transferring firmware Note If the root password has not been set the FTP login to the array will fail 12 Sun StorEdge T3 Array Field Service M
196. se A Loop A Mask lt 1 gt B Mask lt 1 gt uld4 SVD_PATH_FAILBACK path_id 1 uld5 SVD_PATH_FAILBACK path_id 1 uld6 SVD_PATH_FAILBACK path id 1 uld7 SVD_PATH_FAILBACK path id 1 uld8 SVD_PATH_FAILBACK path_id 1 uld9 SVD_PATH_FAILBACK path_id 1 u111 ONDG Loop Fault Diag HE FIND Initiated on ulll FIND Completed on ulll STATUS PASS ul PASS ONDG Completed CODE EXAMPLE 8 14 Chapter 8 Diagnosing and Correcting FC AL Loop Problems 121 12 Once the failed disk drive FRU has been identified remove the suspect disk drive from the configuration with the vol disable command lt 9 gt fru s uld1 9 DISK STATUS STATE ROLE PORTI PORT2 TEMP VOLUME uldl ready enabled data disk ready ready 30 voll uld2 ready enabled data disk ready ready 31 voll uld3 ready enabled data disk ready ready 30 voll uld4 ready enabled data disk ready ready 29 voll uld5 ready enabled data disk ready ready 29 voll uld6 ready enabled data disk ready ready 29 vol3 uld7 ready enabled data disk ready ready 34 vol3 uld8 ready enabled data disk ready ready 37 vol3 uld9 ready enabled data disk ready ready 32 vol3 lt 10 gt vol disable uld9 lt ll gt fru s uld1 9 DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME uldl ready enabled data disk ready ready 30 voll uld2 ready enabled data disk ready ready 31 voll uld3 ready enabled data disk ready ready 30 voll u
197. si amp Sun microsystems Sun StorEdge T3 and T3 Array Field Service Manual Sun Microsystems Inc 4150 Network Circle Santa Clara CA 95054 U S A 650 960 1300 Part No 816 4774 10 November 2002 Revision A Send comments about this document to docfeedback sun com Copyright 2002 Sun Microsystems Inc 4150 Network Circle Santa Clara CA 95054 U S A All rights reserved This product or document is distributed under licenses restricting its use copying distribution and decompilation No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors if any Third party software including font technology is copyrighted and licensed from Sun suppliers Parts of the product may be derived from Berkeley BSD systems licensed from the University of California UNIX is a registered trademark in the U S and other countries exclusively licensed through X Open Company Ltd Sun Sun Microsystems the Sun logo AnswerBook2 docs sun com JumpStart Sun StorEdge Storage Automated Diagnostic Environment SunSolve and Solaris are trademarks registered trademarks or service marks of Sun Microsystems Inc in the U S and other countries All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International Inc in the U S and other countries Products bearing SPARC trademarks are based upon an architecture developed by
198. sks done clt1d0 configured with capacity of 133 38GB AVAILABLE DISK SELECTIONS 0 c0t2d0 lt drive type unknown gt sbus 1f 0 SUNW fas e 8800000 sd 2 0 1 c0t3d0 lt SUN2 1G cyl 2733 alt 2 hd 19 sec 80 gt sbus 1f 0 SUNW fas e 8800000 sd 3 0 2 clt1d0 lt SUN T3 0100 cyl 34145 alt 2 hd 64 sec 128 gt sbus 1f 0 SUNW socal 1 0 s 0 0 ssd w50020 2300000121 0 Specify disk enter its number In this example device number 2 is a volume on the array as identified by the SUN T3 0100 label Sun StorEdge T3 Array Field Service Manual November 2002 Storage Automated Diagnostic Environment Link Test Use the Storage Automated Diagnostic Environment to verify the physical connection between the host array and any other physical devices and to determine the primary and alternate paths Access the Storage Automated Diagnostic Environment main window and click the Diagnose link Then click the Diagnostics Tests link See the Storage Automated Diagnostic Environment User s Guide for instructions Caution Any Sun StorEdge T3 Array that is connected to a host via a switch by using F Ports on the array side will fail If the port is an F Port you need to remove the cable from the array before running Switchtest The link however works if the array is configured on a TL Port This note is in reference to BugID 4731718 Checking Array Boot Status Establish a serial connection with the array as described in Estab
199. slog in nvram loglevel 4 rarp on mac 00 20 2 00 93 24
6. Reset the master to initiate the tftpboot cycle.
T3-1> reset
Starting...
T3-1 Release 2.10 1999/11/24 13:05:57 123.123.123.3
Copyright (C) 1997-1999 Sun Microsystems, Inc. All Rights Reserved.
Found units: [u1-ctr]
tftp boot is enabled, hit the RETURN key within 3 seconds to cancel...
Initializing TFTP...
Loading 123.123.123.6 nb113.bin
login:
7. Copy the firmware from the tftpboot server to the Sun StorEdge T3 array.
Note that the ftp command is initiated from the TFTP server, since at this point you are no longer on the array.
mgmt_host ftp 123.123.123.3
Connected to 123.123.123.3.
220 123.123.123.3 pSOSystem FTP server (NUPPC/2.0.0-G) ready.
Name (123.123.123.3:root): root
331 Password required for root.
Password:
230 User root logged in.
ftp> lcd /tftpboot
Local directory now /tftpboot
ftp> bin
200 Type set to I.
ftp> put filename.bin
200 PORT command successful.
150 Opening BINARY mode data connection for filename.bin
226 Transfer complete.
local: filename.bin remote: filename.bin
2514468 bytes sent in 51 seconds (47.87 Kbytes/s)
ftp>
Where filename.bin is the name of the current firmware file. For example, nb113.bin.
8. Boot the newly transferred controller firmware image on the master.
<3> boot -i filename
200. is powered on (alternating current). One of the following symbols may be used, depending on the type of power switch on your system.
Caution: OFF. Your system is powered off (alternating current).
Caution: STANDBY. The On/Standby switch is in the standby position.
Modifications to Equipment
Do not make mechanical or electrical modifications to the equipment. Sun Microsystems is not responsible for the regulatory compliance of a Sun product that has been modified.
Placement of a Sun Product
Caution: To ensure reliable operation of your Sun product and to keep it from overheating, do not block or cover the openings in the equipment. Never place a Sun product near a radiator or other heat source.
Caution: The workplace noise level, measured according to DIN 45 635 Part 1000, is 70 dB(A) or less.
SELV Compliance
The safety status of the I/O connections complies with SELV requirements.
Power Cord Connection
Caution: Sun products are designed to work with single-phase power systems that have a grounded neutral conductor. To reduce the risk of electric shock, do not plug Sun products into any other type of power system. If you are in doubt about
201. stat
Unit State Role Partner
1 DISABLED Slave
2 ONLINE Master
4. Use the port list command to display how paths are mapped from the host ports to the volume. This displays World Wide Names (WWNs) that can be compared to the WWNs displayed by the Solaris command format(1M).
<4> port list
port targetid addr_type status host wwn
u1p1 1 hard online sun 50020f23000002ba
u2p1 2 hard online sun 50020f23000002cd
mgmt_host format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t0d0 <SUN4.2G cyl 3880 alt 2 hd 16 sec 135>
/pci@1f,4000/scsi@3/sd@0,0
1. c2t1d0 <SUN-T300-0101 cyl 34145 alt 2 hd 64 sec 128>
/pci@6,2000/SUNW,ifp@1/ssd@w50020f23000002ba,0
Specify disk (enter its number):
In the example above, the WWN 50020f23000002ba appears in both the port list output and the format output, which identifies the matching port and volume.
Verifying the Firmware Level and Configuration
The Sun StorEdge T3 array has four different types of firmware:
■ Controller firmware. See the Sun StorEdge T3 Array Installation and Configuration Manual.
■ Interconnect card firmware. See the Sun StorEdge T3 Array Installation and Configuration Manual.
■ Controller electrically erasable programmable read only memory (EPROM) firmware. See Controller EPROM Firmware on page 51.
■ Disk drive firmware. See Check the drive status to ensure that the reconstruction of the replaced drive FR
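The WWN comparison between the array's port list output and the host's format output can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names port_wwns and ports_for_device are hypothetical, the WWN is assumed to be the last field of each port list row, and the Solaris device path is assumed to carry the WWN in the form ssd@w<16 hex digits>.

```python
import re


def port_wwns(port_list_text):
    """Map each array port (for example u1p1) to its WWN from 'port list' output."""
    wwns = {}
    for line in port_list_text.splitlines():
        fields = line.split()
        # Data rows start with a port name such as u1p1; the header row does not.
        if len(fields) >= 2 and re.fullmatch(r"u\d+p\d+", fields[0]):
            wwns[fields[0]] = fields[-1].lower()  # assumption: WWN is the last field
    return wwns


def ports_for_device(device_path, wwns):
    """Return the array ports whose WWN appears in a Solaris device path."""
    match = re.search(r"ssd@w([0-9a-fA-F]{16})", device_path)
    if not match:
        return []
    target = match.group(1).lower()
    return sorted(port for port, wwn in wwns.items() if wwn == target)


if __name__ == "__main__":
    port_list = """\
port targetid addr_type status host wwn
u1p1 1 hard online sun 50020f23000002ba
u2p1 2 hard online sun 50020f23000002cd
"""
    path = "/pci@6,2000/SUNW,ifp@1/ssd@w50020f23000002ba,0"
    print(ports_for_device(path, port_wwns(port_list)))
```

An empty result would mean that the host device path does not correspond to any port reported by the array, which is worth checking before assuming a path is mapped.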
202. stomer regularly runs a performance monitoring program where thresholds have been set, the iostat command shows whether one path to a Sun StorEdge T3 array partner group is not performing to the established baseline. For example:
CODE EXAMPLE 8-8 iostat Output for Normal (Baseline) Operation
[iostat output not recoverable from the extracted text. Surviving columns: Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device. Surviving device annotations: c5t1d0 (normal u1ctr I/O), c6t2d1 (normal u2ctr I/O).]
CODE EXAMPLE 8-9 iostat Output for Abnormal (Problem) Operation
[iostat output not recoverable from the extracted text. Surviving columns: Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device. Surviving device annotations: c5t1d0 (abnormal u1ctr I/O), c6t2d1 (normal u2ctr I/O).]
203. subject line of your email xxxii Sun StorEdge T3 Array Field Service Manual November 2002 CHAPTER 1 Troubleshooting Overview This chapter provides an introduction to some of the tools available to troubleshoot the Sun StorEdge T3 array and describes the following sections Network Storage Overview on page 1 Maintenance Precaution on page 2 Error Messages and Logs on page 2 Sun Storage Automated Diagnostic Environment on page 4 Static Electricity Precautions on page 5 Network Storage Overview An understanding of a network storage environment is required before any troubleshooting can take place Each Sun StorEdge T3 array can be configured with a maximum of two RAID volumes If desired these volumes can be partitioned into up to 16 distinct logical unit numbers LUNs Volumes need not be partitioned with an equal number of LUNs however the total must not exceed 16 The Sun StorEdge T3 array partner group consists of two independent controller RAID units sharing only one of the controllers for system management If one controller fails the system management facilities fail over to the other RAID controller This configuration gives the partner group redundancy Thus when configured as a Sun StorEdge T3 array enterprise configuration which consists of two arrays a maximum of four RAID volumes LUNs are available to the server for data delivery and retrieval Although an addi
204. support for redundancy m Revision checking m Remote notification through SRS SRS NetConnect RSS HTTP SSTR and SMTP Providers or email m Support for storage area networks SANs and direct attached storage DAS devices The Storage Automated Diagnostic Environment can be downloaded from the Sun web site See the Storage Automated Diagnostic Environment User s Guide for instructions Errors in the host data channel are outside of the scope of the Sun StorEdge T3 Array Field Service Manual Host to array channel failures occur when the connection between the array and the host is either severed or intermittent The components that make up this data channel connection can include m Host bus adapter HBA which resides on the host 4 Sun StorEdge T3 Array Field Service Manual November 2002 Gigabit interface converter GBIC adapter used to connect the FC AL cable to an SBus HBA Fibre Channel cable that connects the array to the host Media interface adapter MIA which converts the light source from the host to an electron source for use in the array Channel interface port in the array Fibre Channel switches connecting the host to the storage in a SAN To determine failures in the data path use a host based application diagnostics product such as the Sun Storage Automated Diagnostic Environment for the Solaris operating environment Static Electricity Precautions Follow these procedures to prevent damaging t
205. sure you are aware of these limitations before running ofdg.
The following are limitations for using ofdg:
■ Before running the ofdg utility, all disks other than those located in the u1 tray must be assigned to a LUN. Problems might occur if ofdg is run on systems where non-u1 disks have not been assigned to volumes.
■ ofdg does not detect missing loop cables.
■ ofdg output goes to the syslog and serial port only.
■ ofdg assumes at least one back-end loop cable is functional.
After installing a new drive, wait two minutes before running ofdg.
Follow these steps to run ofdg:
1. Perform an ofdg health_check operation.
<1> ofdg health_check
All loops are given either a Go or No Go status.
■ A Go status indicates that the ofdg test did not detect any problems with the configuration and that there is no need for further tests.
■ If there is a No Go status, proceed to the next step.
2. Perform an ofdg fast_test operation.
<2> ofdg fast_test u1l1
All loops are given either a Go or No Go status.
■ A Go status indicates that the ofdg test did not detect any problems with the configuration and that there is no need for further tests.
■ If there is a No Go status, proceed to the next step.
3. Perform an ofdg fast_find operation.
<3> ofdg fast_find u1l1
The loop is given a Go or No Go s
206. systems Inc All Rights Reserved Initializing software Found units ul ctr u2 ctr Default master is ul Default alternate master is u2 Waiting for Master to come up Starting Heartbeats Initializing system drivers Initializing XPT component Initializing OLCF component Initializing loop 1 ISP2200 firmware status 3 Detected 20 FC AL ports on loop 1 Initializing loop 2 ISP2200 firmware status 3 Detected 20 FC AL ports on loop 2 Initializing SVD services Detected data cache size in system 1GB Testing ISP2200 Passed Testing ECC mechanism Passed Testing XOR functions and datapaths Passed Cold Boot detected destructive tests OK Testing data cache memory Passed Initializing Cache Memory 200 Sun StorEdge T3 Array Field Service Manual November 2002 Initializing loop 2 to accept SCSI commands Starting Syslog Daemon Waiting for configuration data from master Initializing host port u2pl ISP2200 firmware status 7 Host port u2pl TARGET_ID Oxffff ALPA 0x5 Starting psh Login Task List Tasks on a Sun StorEdge T3 array correspond to processes on a Solaris system The following are typical Sun StorEdge T3 array tasks TMRT Timer Task Handles fru removal time out LXRO Handles incoming messages from loop card serial port one for each loop card LXR1 Handles incoming messages from loop card serial port one
207. t Protocol 17 sn 169 SNMP 17 static electricity 5 status codes 60 Storage Automated Diagnostic Environment 4 StorTools 4 Sun Documentation Online xxxii Sun StorEdge T3 disk tray boot defaults 169 cable assemblies 167 commands 10 controller cards 165 default directories 172 default settings 135 disk tray 160 disks amp drives 59 door assembly 162 drive assemblies 166 FC AL 96 files 172 FTP connection 12 interconnect assemblies 167 interconnect cards 75 163 overview 1 parts 159 remote logging 17 serial connection 7 system defaults 169 Telnet connection 9 tftp booting 13 troubleshooting introduction 1 worksheets 221 SunSolve web site 32 sys command 31 sys stat command 31 SysFail reset log type 191 syslog daemon 2 syslog file 7 108 syslog conf file 17 system defaults 170 remote logging 17 verifying firmware level 32 system area recovery 40 System generated messages 2 T Takeover reset log type 191 234 Sun StorEdge T3 Array Field Service Manual November 2002 telnet command 10 59 68 master vs alternate controller 103 Telnet connection 9 tftp filename 14 tftpboot command 13 16 tftpfile 170 tftphost 169 tip command 8 tools troubleshooting 19 troubleshooting flow charts 22 info sources 19 initial steps 25 tools 19 typographic conventions xxx U UPS battery see batteries V vendor 170 vendor ID field 171 ver command 33
208. t cards or UIC can be replaced without affecting the online operation of the product, though there may be some performance impact. See Interconnect Card Replacement Procedure on page 115.
3. Isolate, replace, and verify the RAID controllers.
Replacing RAID controllers causes a LUN controller path failover. This failover might require some kind of manual procedure by the customer to continue running, and it might affect the overall system performance. See RAID Controller Replacement Procedure on page 116.
4. Isolate, replace, and verify the FC-AL disk drives.
Perform this step only if steps 2 and 3 fail to resolve the problem. To run the loop diagnostics to identify a failed drive FRU, the Sun StorEdge T3 array must be removed from operation. Removing the array is highly disruptive to the customer. See Off-Line Drive Diagnostics and Replacement on page 117.
5. Replace and verify the chassis and mid-plane.
If there is still a problem at the end of step 4, the chassis and mid-plane need to be replaced. Perform this step only if steps 2, 3, and 4 fail to resolve the problem. This is highly disruptive to the customer. See Chassis Replacement Procedure on page 123 and Replacing the Chassis/Backplane Assembly on page 126.
Normal Status
The normal configuration information can be determined by using the following CLI commands and interpreting the results:
■ fru stat see The fru stat Command on
209. t is replaced. It is therefore important to capture this information when returning a controller for CPAS. It is also a good idea to clear the reset log on the new controller. See Reset Log Message Types on page 191.
Note: Regarding an enterprise configuration. The reset you see in the syslog from extractor or a reboot is from the active master controller. To dump the log on the alternate, you need to use the serial port and run the commands from there.
Assertion and 2004 Exceptions are software related. 2003 Exceptions are hardware related. However, you can get an Assertion when a FRU fails, causing a retry threshold on RAID reads to be exceeded.
The useful information is the first line of an assertion or exception. It indicates what type of reset occurred, which provides an idea of how to proceed. Without access to the source code, these messages are of almost no value other than as indicators that something happened. They must always be interpreted in the context of other events.
■ SysFail (cache parity reset): replace the controller.
■ Assertion: look at the source code, go to the line in the file referenced, and determine, based on syslog events, whether it is relevant.
■ Exception: hard to say, since there is no source to reference. In these cases you have to improvise. Are there any failed FRUs?
Examples
Cache Parity Error Replace Ctlr 15 18 35 t3a pshc 1 W
210. t messages, one from the master controller and one from the alternate master controller:
Initializing loop 1 ISP2200 ... firmware status = 3
Detected 19 FC-AL ports on loop 1
Initializing loop 2 ISP2200 ... firmware status = 3
Detected 19 FC-AL ports on loop 2
and
Initializing loop 1 ISP2200 ... firmware status = 3
Detected 20 FC-AL ports on loop 1
Initializing loop 2 ISP2200 ... firmware status = 3
Detected 20 FC-AL ports on loop 2
These messages are generated by the ISP devices that service the back-end loops. They are polling the FC-AL loops for FC-AL devices. The first section of output, from the master controller, detects 19 FC-AL ports. The next section detects 20 FC-AL ports. The missing port is actually the alternate master controller. It is missing because it has not completed its own boot process when the master controller polls for FC-AL devices. Once the alternate master boots, it also polls for FC-AL devices. Since the master controller and all the drives are already running, 20 devices (9 drives per Sun StorEdge T3 array, a master controller, and an alternate master controller) are found on the FC-AL loop at this time.
The firmware status codes generated during the boot cycle can be good indicators of internally detected system and configuration problems. TABLE C-7 specifies the firmware status codes that can be reported through a serial port console.
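The 19-versus-20 port counts above follow from simple arithmetic on a two-array partner group. The following Python sketch makes that explicit; the function name expected_fcal_ports and its parameters are hypothetical and exist only to illustrate the count.

```python
def expected_fcal_ports(arrays=2, drives_per_array=9, controllers_booted=2):
    """Number of FC-AL ports a controller should detect on a back-end loop.

    Assumes a partner group in which every drive and every booted
    controller presents one port on the loop being polled.
    """
    return arrays * drives_per_array + controllers_booted


if __name__ == "__main__":
    # Master polls while the alternate master is still booting.
    print(expected_fcal_ports(controllers_booted=1))
    # Alternate master polls after both controllers are running.
    print(expected_fcal_ports(controllers_booted=2))
```

A detected count below the value for the current boot stage would suggest that a drive or controller port is not visible on that loop.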
211. t onto the new chassis.
FIGURE 9-2 Serial Number and MAC Address on Pull-out Tab
7. On the Sun StorEdge T3 array, disconnect all external cables.
Disconnect all power, interconnect, host FC-AL, MIA, and Ethernet cables.
Note: If the array is part of a partner group, note the placement of the host FC-AL connections and loop cables. You need this information in Step 12.
8. Remove the chassis if it is mounted in a cabinet.
a. Remove the two screws at the back of the chassis that secure it to the side rails in the cabinet.
b. Slide the chassis out of the cabinet (FIGURE 9-3).
FIGURE 9-3 Removing the Chassis
9. Move the failed array to an area where both the front and back can be easily accessed.
Caution: Use two people to lift and move the array. It can weigh up to 67 lbs (30 kg).
10. One at a time, remove all the FRU components from the failed chassis and put them in the new chassis, ensuring same-location placement.
Caution: FRUs are extremely sensitive to static electricity. Use a proper antistatic wrist strap and procedures when handling any FRU. Observe all static electricity precauti
212. tatus with progress indications.
If a failure is reported on the first (or nearest) enclosure, the loop card in that enclosure should be swapped before repeating the test with the next unit.
If a failure is reported for the second (or further) enclosure, fast_find isolates the bad FRU(s) to either a bad interconnect cable or the two interconnect cards that are connected to the interconnect cable in question. In this case, fast_find should be run from the partner controller to eliminate some FRUs.
If, after running fast_find in both directions, the problem has not been isolated to a single bad FRU, the bad FRU might be the interconnect cable, the interconnect card, or both.
a. Replace the interconnect cable and retest.
b. Replace the interconnect card and retest.
If the problem persists, continue to the next step.
4. Perform an ofdg find operation.
<4> ofdg find u1l1
The loop is given a Go or No Go status with progress indications.
If a failure is detected, Loop Fault Diag is automatically invoked to find the bad disk ports. If ofdg find is not successful in solving the problem, the backplane should be suspected. See Replacing the Chassis/Backplane Assembly on page 126 for details.
The health_check Option
The health_check option provides a fast Go/No-Go loop test for all the loops in the array. The health_check option calls f
213. tebehind on vol2 yes writebehind on vol3 yes writebehind on vol4 yes writebehind on
The port listmap Command
The port listmap command returns the current controller-to-volume path. One controller controlling all the configured volumes might indicate loop problems.
CODE EXAMPLE 8-2 port listmap Command Normal Output
<3> port listmap
port targetid addr_type lun volume owner access
u1p1 1 hard 0 vol1 u1 primary
u1p1 1 hard 1 vol1 u1 primary
u1p1 1 hard 2 vol1 u1 primary
u1p1 1 hard 3 vol1 u1 primary
u1p1 1 hard 4 vol1 u1 primary
u1p1 1 hard 5 vol1 u1 primary
u1p1 1 hard 6 vol1 u1 primary
u1p1 1 hard 7 vol1 u1 primary
u1p1 1 hard 8 vol1 u1 primary
u1p1 1 hard 9 vol1 u1 primary
u1p1 1 hard 10 vol2 u2 failover
u1p1 1 hard 11 vol2 u2 failover
u1p1 1 hard 12 vol3 u1 primary
u1p1 1 hard 13 vol3 u1 primary
u1p1 1 hard 14 vol4 u2 failover
u1p1 1 hard 15 vol4 u2 failover
u2p1 2 hard 0 vol1 u1 failover
u2p1 2 hard 1 vol1 u1 failover
u2p1 2 hard 2 vol1 u1 failover
u2p1 2 hard 3 vol1 u1 failover
u2p1 2 hard 4 vol1 u1 failover
u2p1 2 hard 5 vol1 u1 failover
u2p1 2 hard 6 vol1 u1 failover
u2p1 2 hard 7 vol1 u1 failover
u2p1 2 hard 8 vol1 u1 failover
u2p1 2 hard 9 vol1 u1 failover
u2p1 2 hard 10 vol2 u2 primary
u2p1 2 hard 11 vol2 u2 primary
u2p1 2 hard 12 vol3 u1 failover
u2p1 2 hard 13 vol3 u1 failover
u2p1 2 hard 14 vol4 u2 primary
u2p1 2 hard 15 vol4 u
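The check suggested above, whether one controller owns every configured volume, can be automated against captured port listmap output. The following is a minimal Python sketch under stated assumptions: the function names owners_by_volume and single_owner are hypothetical, and each data row is assumed to have exactly the seven columns shown in the header (port, targetid, addr_type, lun, volume, owner, access).

```python
def owners_by_volume(listmap_text):
    """Return {volume: owning controller} from 'port listmap' output."""
    owners = {}
    for line in listmap_text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a full seven-column row.
        if len(fields) != 7 or fields[0] == "port":
            continue
        _port, _targetid, _addr_type, _lun, volume, owner, _access = fields
        owners[volume] = owner
    return owners


def single_owner(owners):
    """True when two or more volumes exist and one controller owns them all."""
    return len(owners) > 1 and len(set(owners.values())) == 1


if __name__ == "__main__":
    normal = """\
port targetid addr_type lun volume owner access
u1p1 1 hard 0 vol1 u1 primary
u1p1 1 hard 10 vol2 u2 failover
u2p1 2 hard 0 vol1 u1 failover
u2p1 2 hard 10 vol2 u2 primary
"""
    owners = owners_by_volume(normal)
    print(owners, "possible loop problem" if single_owner(owners) else "ownership split")
```

A True result from single_owner is only a hint to look further, since a deliberately configured array could also place every volume on one controller.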
214. ted Diagnostic Environment on page 36 Identifying Miscabled Partner Groups on page 36 Identifying Data Channel Failures on page 39 Reserved System Area Recovery Procedure on page 40 Diagnostic Information Sources TABLE 3 1 summarizes the diagnostic tools available to you TABLE 3 1 Diagnostic Functions and Tools Function Tools That Can Be Used Array boot monitoring LEDs CLI S Array boot PROM CLI S commands Host data path diagnosis SADE Internal monitoring LEDs CLI E CLI S SNMP SADE syslog SRS 19 TABLE 3 1 Diagnostic Functions and Tools Continued Function Tools That Can Be Used Configuration LEDs CLI E CLI S System admin domain configuration System admin domain monitoring Version level check LUN configuration FRU failure monitoring Performance monitoring Firmware download Syslog access mgmt host Loop resiliency check manual Manual loop resiliency check Clear supervisor password Host data path diagnosis Statistics logging Service commands Mfg repair commands CLI E CLI S CLI E CLI S SRS SNMP CA syslog CA CLI E CLI S CM CLI E CLI S LED CLI E CLI S SRS SNMP CA syslog CA CLI E CLI S SNMP CA syslog CA CLI E CLI E CLI S syslog CA and SADE with 2nd copy of SADE running on management host with ethernet connection to array OFDG CLI E CLI S OFDG CLI E CLI S CLI S SADE syslog CA
215. the data host datahost drvconfig disks devlinks Note Any applications specifically dependent on the volume s device path also need to be changed Refer to each application s documentation for instructions 25 Execute a format command on the data host to verify that the Sun StorEdge T3 array devices are seen The Sun StorEdge T3 array volumes are now usable by the data host and can be mounted or re enabled with the appropriate volume manager software Chapter 9 Chassis Backplane Assembly 133 134 Sun StorEdge T3 Array Field Service Manual November 2002 CHAPTER 10 Hardware Reconfiguration This chapter provides procedures for reconfiguring existing array hardware to create new configurations It includes the following sections m Connecting Single Controller Units to Form a Partner Group on page 135 m Disconnecting a Partner Group to Form Single Controller Units on page 149 m Changing the Port ID on the Array on page 158 Connecting Single Controller Units to Form a Partner Group Caution This procedure destroys data Back up all your data before beginning this procedure This section describes how to reconfigure two existing single controller units that contain data to form a partner group redundant controller units You will need two interconnect cables to connect the units See Appendix A for a part number and illustration of the interconnect cable This procedure includes
216. tically assigned. See Establishing a New IP Address on page 141 for detailed instructions.
Establishing a Network Connection
After powering on, establish a network connection to each array. This ensures that both arrays function properly and recognize the host.
On the host, use the telnet command with the array name or IP address to connect to the array.
telnet array_name
Trying 129.150.47.101...
Connected to 129.150.47.101.
Escape character is '^]'.
Telnet session 129.150.47.101
Note: The Telnet session verifies that your network connection is good. If the IP address is not assigned correctly, verify the IP address over a serial cable connection to make sure that the RARP server is functional.
Log in to the array by typing root and your password at the prompts.
■ If you are logging in to the previous master unit, use the password for that unit.
■ If you are logging in to the previous alternate master unit, you need to assign a new password. When prompted for a password, press Return.
Note: If you need to create a new password or change some of the parameters, such as the gateway, netmask, and others, refer to Chapter 2 of the Sun StorEdge T3 Array Installation and Configuration Manual for instructions.
Use the vol list and vol stat commands to verify that the phantom volume has been deleted and that the existing volume rema
217. tional two volumes can be added to the two available in a workgroup configuration, the maximum total number of LUNs remains at 16.
The interruption of data can happen anywhere on the storage network. This manual addresses data interruption problems from the output of the host to the Sun StorEdge T3 array and to the individual components in the array.
Maintenance Precaution
After configuring a system, always record the following data to prepare for the possibility of having to perform a recovery procedure:
■ Array block size
■ Multipathing settings
■ Volume configuration
■ Volume slicing configuration
■ LUN masking settings
Error Messages and Logs
Both the Sun StorEdge T3 array and the host server create log message files of system conditions and events. These log files are the most useful immediate tools for troubleshooting.
Sun StorEdge T3 Array Generated Messages
A syslog daemon in the Sun StorEdge T3 array writes system error message logs to a location determined by the site system administrator. Consult with the site system administrator to obtain access to this log. Refer to the Sun StorEdge T3 Array Administrator's Manual for instructions on setting up remote logging.
Host Generated Message
A syslog daemon in the host hardware writes system error message logs to /var/adm/messages.
The data host sees an array or enterprise configuration as a gr
218. tr 31.0
u2ctr ready enabled alt master u1ctr 3025
DISK STATUS STATE ROLE PORT1 PORT2 TEMP VOLUME
u1d1 ready enabled data disk ready ready 30 vol1
u1d2 ready enabled data disk ready ready 31 vol1
u1d3 ready enabled data disk ready ready 30 vol1
u1d4 ready enabled data disk ready ready 29 vol1
u1d5 ready enabled data disk ready ready 29 vol1
u1d6 ready enabled data disk ready ready 30 vol3
u1d7 ready enabled data disk ready ready 34 vol3
u1d8 ready enabled data disk ready ready 37 vol3
u1d9 ready enabled data disk ready ready 32 vol3
u2d1 ready enabled data disk ready ready 34 vol2
u2d2 ready enabled data disk ready ready 38 vol2
u2d3 ready enabled data disk ready ready 36 vol2
u2d4 ready enabled data disk ready ready 37 vol2
u2d5 ready enabled data disk ready ready 34 vol2
u2d6 ready enabled data disk ready ready 36 vol4
u2d7 ready enabled data disk ready ready 35 vol4
u2d8 ready enabled data disk ready ready 40 vol4
u2d9 ready enabled data disk ready ready 36 vol4
LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP
u2l1 ready enabled master installed 29.5
u2l2 ready enabled slave installed 31.0
u1l1 ready enabled master installed 29.5
u1l2 ready enabled slave installed 30.5
POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2
u1pcu1 ready enabled line normal normal normal normal normal
u1pcu2 ready enabled line normal normal normal normal normal
u2pcu1 ready enabled line normal normal normal normal normal
u2pcu2 ready enabled line normal normal
219. [...] to avoid electric shock and injury.
Depending on the type of power switch on your device, one of the following symbols may be used:
Caution: On. Applies AC power to the system.
Caution: Off. Removes AC power from the device.
Caution: Standby. The On/Standby switch is in the standby position.
Modifications to Sun Equipment
Do not make mechanical or electrical modifications to the equipment. Sun Microsystems is not responsible for the regulatory compliance of a Sun product that has been modified.
Placement of Sun Equipment
Caution: To ensure reliable operation of your Sun equipment and to protect it from overheating, the openings in the equipment must not be blocked or covered. Sun products should never be placed near radiators or heat registers.
Caution: The workplace-related sound pressure level according to DIN 45 635 Part 1000 is 70 dB(A) or less.
SELV Compliance
The safety status of the I/O connections complies with the requirements of the SELV specification.
Power Cord Connection
[...] designed for operation on single-phase power systems with a grounded neutral conductor. To reduce the risk of electric shock, do not connect Sun products to any other type of power source. Your facilities manager or a qualified
220. N: u1ctr Port event received on port 2 abort 0 id 1
N: u1ctr ITL 1 0 0 TT 20 TID A308 OP 2A Target in Unit Attention
N: u1ctr <<Abort Task Set>> on port 2 abort 0
Jun 02 05:42:05 FCC2[1]: N: u1ctr Port event received on port 2 abort 0 id 1
Jun 02 05:42:07 FCC2[1]: N: u1ctr Port event received on port 2 abort 0 id 1
Jun 02 05:42:07 FCC2[1]: N: u1ctr ITL 1 0 0 TT 20 TID A50C OP 2A Target in Unit Attention
Jun 02 05:42:07 FCC2[1]: N: u1ctr <<Abort Task Set>> on port 2 abort 0
Jun 02 05:42:07 FCC2[2]: N: u2ctr Port event received on port 5 abort 0 id 0
Jun 02 05:42:07 FCC2[2]: N: u2ctr Port event received on port 5 abort 0 id 0
Jun 02 05:42:07 FCC2[2]: N: u2ctr Port event received on port 5 abort 0 id 0
Jun 02 05:42:08 FCC2[2]: N: u2ctr ITL 0 1 1 TT 20 TID A6EC OP 2A Target in Unit Attention
Jun 02 05:42:08 FCC2[2]: N: u2ctr <<Abort Task Set>> on port 5 abort 0
SVD SVC Error Messages
ISR1[1]: W: SVC_PATH_FAILOVER path_id 0 lid 15
where:
- path_id 0 = backend loop 1
- lid (logical unit identification) 15 = u2d7
- SVD talks in terms of lids (LUN ids). See TABLE C-3.
TABLE C-3 LIDs corresponding to LUN IDs (example)
LID    Name     target_id  LUN
1-8    u1d1-8   08-0f      0
9-16   u2d1-8   10-17      0
17     u1d9     97         0
18     u2d9     98         0
19     v0       1          0
19     v1       0          1
20     v2       1          2
20     v3       0          3
where v0, v1,
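The LID-to-disk rows of TABLE C-3 in the excerpt above can be expressed as a small lookup. This is only a sketch built from the example table (LIDs 1 to 8 on unit 1, 9 to 16 on unit 2, 17 and 18 for the ninth drives); the function name `lid_to_fru` is hypothetical and is not part of any array command set.

```python
def lid_to_fru(lid: int) -> str:
    """Map a disk LID from the example table to its FRU name (e.g. 15 -> 'u2d7').

    Assumption: the mapping follows the example table only. LIDs 19 and 20
    are virtual cache-mirroring LUNs, not disks, so they are rejected here.
    """
    if 1 <= lid <= 8:
        return f"u1d{lid}"        # unit 1, drives 1-8
    if 9 <= lid <= 16:
        return f"u2d{lid - 8}"    # unit 2, drives 1-8
    if lid == 17:
        return "u1d9"             # ninth drive of unit 1
    if lid == 18:
        return "u2d9"             # ninth drive of unit 2
    raise ValueError(f"LID {lid} is not a disk LID in the example table")


if __name__ == "__main__":
    # The SVC_PATH_FAILOVER example reports lid 15.
    print(lid_to_fru(15))
```

Applied to the failover message above, lid 15 resolves to u2d7, which matches the explanation given with the message.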
221. un com
During a disk drive firmware download, the functionality of the array is limited. To avoid system problems, verify that:
- A current backup copy of array data exists.
- The data path to the host has been quiesced. There must not be any I/O activity during the disk drive firmware download.
- The Ethernet connection is not being used for any other operation during this procedure.
Caution: If a host-mounted utility program is actively polling, problems might occur during the firmware download. Disable the polling utility during this procedure to avoid problems.
- No unnecessary command-line program interaction with the array is performed during the disk drive firmware download.
Note: The disk firmware download takes approximately 20 minutes for 9 drives. Do not attempt to interrupt the download or perform other command-line functions during the process. The command prompt reappears after the download process has completed.
To upgrade the firmware:
1. Use ftp to transfer the firmware to the array root directory in binary mode. See "Establishing an FTP Session" on page 12 for additional information.
Note: The name of any file transferred to the local disk must be 12 characters or fewer and must start with an alphabetic character, not a numeric one.
2. Establish a Telnet connection to the array. See "Establishing a Telnet Session" on page 9.
Chapter 5 Disks and Drives 71
3. Verify that all
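The file-name rule in the note above (12 characters or fewer, starting with a letter) is easy to check on the host before starting the ftp transfer. The following is a minimal sketch; the function name and the sample file names are hypothetical, and the check covers only the two constraints stated in the note.

```python
def is_valid_array_filename(name: str) -> bool:
    """Return True if name has 1 to 12 characters and starts with an ASCII letter."""
    if not 0 < len(name) <= 12:
        return False
    first = name[0]
    return first.isascii() and first.isalpha()


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the check.
    for candidate in ("drivefw.bin", "9drivefw.bin", "very_long_firmware_name.bin"):
        print(candidate, is_valid_array_filename(candidate))
```

Renaming a firmware image on the host so that it passes this check avoids a failed transfer partway through the procedure.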
222. v2, v3 are volumes created in this order: v0 on u1, v1 on u2, v2 on u1, v3 on u2.
The lids 19 and 20 are the LUN ids assigned to the target cache mirroring LUN, which is a virtual LUN that receives SCSI commands just like a real LUN. Hence the variety of the aborted tasks seen above. It is therefore a shared resource, and each controller holds a pointer to the virtual LUN representing the stripe set for the other controller's volume. Each volume has its own stripe set in cache.
Appendix C Sun StorEdge T3 Array Messages 183
In summary, the targets of the volumes are:
LID    Name   target_id  LUN
19     v0     1          0
19     v1     0          1
20     v2     1          2
20     v3     0          3
However, the controller targets are:
- u1 = 0
- u2 = 1
The targets appear to be reversed because the mirror for u2v1 actually resides on u1. So lid 19 for v1 has to have a target of 0, since reaching it requires access to u1.
Therefore, to eliminate confusion, always try to keep to the following convention for volume creation:
- LUN 0 -> u1
- LUN 1 -> u2
- LUN 2 -> u1
- LUN 3 -> u2
Fatal Timeouts
ISR1[2]: N: u2ctr ISP2100[1] Fatal timeout on target 0.1
ISR1[2]: N: u2ctr ISP2100[1] QLCF_ABORT_ALL_CMDS: Command Timeout Pre-Gauntlet Initiated
where target 0.1 refers to portid target 0 and lun 1. The translation (address resolution) to the FC drives is done at the XPT/SIM level and uses the target.lun format, i.e., in this case target 0 1 0 is t
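The volume-creation convention and the target table in the excerpt above can be sketched as two small functions. This is an illustration that assumes the convention is followed (even LUNs on u1, odd LUNs on u2) and that a volume's cache-mirror target is the controller target of the partner unit, as the table shows. The function names are hypothetical.

```python
# Controller targets as stated in the excerpt: u1 = 0, u2 = 1.
CONTROLLER_TARGET = {"u1": 0, "u2": 1}


def owning_unit(lun: int) -> str:
    """Unit that owns a LUN under the recommended convention (even -> u1, odd -> u2)."""
    if lun < 0:
        raise ValueError("LUN must be non-negative")
    return "u1" if lun % 2 == 0 else "u2"


def mirror_target(lun: int) -> int:
    """target_id reported for the volume: the target of the partner controller,
    because the cache mirror of a volume resides on the other unit."""
    partner = "u2" if owning_unit(lun) == "u1" else "u1"
    return CONTROLLER_TARGET[partner]


if __name__ == "__main__":
    for lun in range(4):
        print(f"LUN {lun}: owner {owning_unit(lun)}, target_id {mirror_target(lun)}")
```

For LUNs 0 to 3 this reproduces the target_id column of the summary table (1, 0, 1, 0), which is why the targets look reversed relative to the controller targets.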
223. verify the netmask and IP address with the set command. Are they correct?
[FIGURE 3-2 Ethernet Troubleshooting Flow Chart. Recoverable boxes: "Can you access the array?"; "Set the IP address manually and reboot the array"; "Can you Telnet into the array now?"; "Done"; "Go to Procedure A".]
Chapter 3 Diagnosing T3 Array Problems 23
Procedure A
Note: Ensure that the host and the array are on the same subnet.
[FIGURE 3-3 Procedure A. Recoverable boxes: "Possible IP conflict. Disconnect the Ethernet cable and ping the array's IP address. Any response?"; "YES: IP conflict. Contact your site network administrator to resolve the conflict"; "Replace the network cable with a known good cable"; "Can you access the array?"; "Change the RAID controller board"; "Set up the IP address".]
Initial Troubleshooting Guidelines
To begin a problem analysis, check one or more of the following information sources for troubleshooting and/or perform one or more of the following checks.
Troubleshooting Sources
- The array LEDs, which provide a visual status, as described in the Sun StorEdge T3 Array Installation and Configuration Manual.
- Sun StorEdge T3 array generated messages, found in a log file, indicating a problem or system status with the array. See Sun StorEdge T3 Arr
224. y ready 40 vol4
u2d9 ready enabled data disk ready ready 36 vol4
LOOP STATUS STATE MODE CABLE1 CABLE2 TEMP
u2l1 ready enabled master installed 29.5
u2l2 ready enabled slave installed 31.0
u1l1 ready enabled master installed 29.5
u1l2 ready enabled slave installed 30.5
POWER STATUS STATE SOURCE OUTPUT BATTERY TEMP FAN1 FAN2
u1pcu1 ready enabled line normal normal normal normal normal
u1pcu2 ready enabled line normal normal normal normal normal
u2pcu1 ready enabled line normal normal normal normal normal
u2pcu2 ready enabled line normal normal normal normal normal
If the array reports a ready status with functional FRUs, you can now restore the data, if necessary, and return the array to operation as a single controller unit.
Changing the Port ID on the Array
To add a partner group to a hub configuration, you must set the port ID values on the arrays to unique values. Sun systems support hard addressing only. However, the port command on the Sun StorEdge T3 array contains the option to set soft addressing. Changing the setting to soft addressing can create problems with host HBAs. In addition, with soft addressing there is the risk of ending up with new cxtxdx node names after performing a system reboot.
Note: Sun StorEdge T3 arrays that are factory configured in cabinets with hubs have unique port ID values assigned. This procedure applies only to standalone partner groups th
225. ying Sun StorEdge T3 Array Ports and Loops
Ports
- On a single Sun StorEdge T3 array there are 3 ports (2 backend, 1 host port).
- On a partner pair there are 6 (4 backend, 2 host ports).
Loops
- FW 1.17 and older: 2 internal loops, 1 external host loop
- FW 1.18, 2.0: 3 internal loops, 1 external loop
So for a T3PP:
- loop1 (path_id 0) connects ports 1 and 4
- loop2 (path_id 1) connects ports 2 and 5
- host ports are 0 (u1) and 3 (u2)
[FIGURE C-1 Loop Port Diagram: u1 (alpa ef, id 0) and u2 (alpa e8, id 1); loop 1 (path 0) joins ports 1 and 4; loop 2 (path 1) joins ports 2 and 5; host ports are u1p1 = 0 and u2p1 = 3.]
Note: You will see references to ports in different contexts. Although the above information is accurate, each disk is also a port on the backend loops but is referenced differently. Disks are referenced in the context of SCSI errors and are identified with a hex number that corresponds to the SEL_ID column in TABLE C-12.
So:
FCC2[2]: N: u2ctr port event received on port 5 abort 0 id 0
In this case, ISP port 2 (initiator id 0, loop 2) on ctrl 1 did a login or logout, generating a Port event on ISP port 5 on ctrl 2.
Appendix C Sun StorEdge T3 Array Messages 181
Note: Regarding these messages, you should only see chatter between ports 1 and 4 when loop 2 has failed and loop 1 is healthy.
Backend Loop chatter (Loop 2 cache mirroring)
Jun 02 05:41:34 FCC2[1]
Jun 02 05:41:36 FCC2[1] Attention
Jun 02 05:41:36 FCC2[1]
J
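The port numbering for a partner pair described in the excerpt above can be captured in a small table, which helps when reading "Port event received on port N" messages. This is a sketch based only on the bullets above (ports 0 and 3 are host ports, loop 1 joins ports 1 and 4, loop 2 joins ports 2 and 5); the names `PORTS`, `describe_port`, and `peer_port` are hypothetical.

```python
from typing import NamedTuple, Optional


class PortInfo(NamedTuple):
    unit: str                # "u1" or "u2"
    role: str                # "host" or "backend"
    path_id: Optional[int]   # backend loop path_id, None for host ports


# Partner-pair port layout as described in the excerpt.
PORTS = {
    0: PortInfo("u1", "host", None),
    1: PortInfo("u1", "backend", 0),   # loop 1, path_id 0
    2: PortInfo("u1", "backend", 1),   # loop 2, path_id 1
    3: PortInfo("u2", "host", None),
    4: PortInfo("u2", "backend", 0),   # loop 1, path_id 0
    5: PortInfo("u2", "backend", 1),   # loop 2, path_id 1
}


def describe_port(port: int) -> PortInfo:
    """Look up which unit and loop a port number belongs to."""
    return PORTS[port]


def peer_port(port: int) -> int:
    """Return the port at the other end of the same backend loop."""
    info = PORTS[port]
    if info.role != "backend":
        raise ValueError(f"port {port} is a host port and has no backend peer")
    return port + 3 if info.unit == "u1" else port - 3


if __name__ == "__main__":
    # A port event logged on port 5 (u2) involves loop 2, whose u1 end is port 2.
    print(describe_port(5), peer_port(5))
```

For the example message, port 5 resolves to u2 on path_id 1 with port 2 as its peer, which agrees with the explanation that ISP port 2 on controller 1 triggered the event.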
226. yte is equal to one billion bytes (1x10^9).
graphical user interface (GUI): A software interface that enables configuration and administration of the Sun StorEdge T3 array using a graphic application.
H
hot spare: A drive in a RAID 1 or RAID 5 configuration that contains no data and acts as a standby in case another drive fails.
hot swap: The characteristic of a field replaceable unit (FRU) to be removed and replaced while the system remains powered on and operational.
I
input/output operations per second (IOPS): A performance measurement of the transaction rate.
interconnect cable: An FC-AL cable with a unique switched loop architecture that is used to interconnect multiple Sun StorEdge T3 arrays. Sometimes referred to as a loop cable.
interconnect card: An array component that contains the interface circuitry and two connectors for interconnecting multiple Sun StorEdge T3 array units. Sometimes referred to as a loop card.
L
light emitting diode (LED): A device that converts electrical energy into light that is used to display activity.
logical unit number (LUN): One or more drives that can be grouped into a unit; also called a volume.
loop cable: Interconnect cable.
loop card: Interconnect card.
M
master unit
media access control (MAC) address
media interface adapter (MIA)
megabyte (MB or Mbyte)
m