Method to recover from a boot device failure during reboot or system IPL
         Contents
1.  [0008] During a reboot or Initial Program Load (IPL) of a POWER5 server, the S-HMC displays progress codes. The progress codes are numeric characters that are displayed sequentially. They indicate the state of the reboot or IPL. If the reboot or IPL fails or stalls, the displayed progress code indicates the point at which the failure occurred. An indication that a reboot or IPL has failed is the fixed display of the same progress code.

[0009] Depending on the boot status failure, the action required for repair may be physical removal of the failed boot device, replacement of the failed boot device, modification of the boot list, or a rebuild of the boot devices. For example, when the boot is stuck with status 0557 (a failure to read the boot sector), the plan of action requires pulling the failed boot device. The mirror boot device must then be used to retry the boot.

[0010] All recovery actions for a failed reboot or a failed IPL require manual intervention from the system administrator or an IBM representative. The boot recovery action is prone to further problems due to user mistakes during the repair action. With manual intervention, it takes time to perform the repair actions. In addition, while the failure to boot or IPL persists, the DS8000 is in single logical partitioning (LPAR) mode. This increases the likelihood of exposure to a situation where the storage controller is not available if
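The stall criterion described in [0008], a fixed display of the same progress code, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the idea of sampling the displayed code periodically, and the `stall_after` threshold are hypothetical, since the text does not say how long a code must stay fixed. Only the meaning of code 0557 comes from the text; the other codes in the sample readings are placeholders.

```python
from typing import Optional, Sequence

# The meaning of 0557 is taken from the text; no other code meanings are known here.
KNOWN_CODES = {"0557": "failure to read the boot sector"}


def find_stalled_code(samples: Sequence[str], stall_after: int) -> Optional[str]:
    """Return the first progress code seen in `stall_after` consecutive samples.

    `samples` is a hypothetical series of periodic readings of the displayed
    progress code. Returns None if no code stays fixed for that long.
    """
    if stall_after < 2:
        raise ValueError("stall_after must be at least 2")
    run_code: Optional[str] = None
    run_length = 0
    for code in samples:
        if code == run_code:
            run_length += 1
        else:
            # The display advanced to a new code, so restart the count.
            run_code, run_length = code, 1
        if run_length >= stall_after:
            return run_code
    return None


# Placeholder readings: "0001" and "0002" are made-up codes for illustration.
readings = ["0001", "0002", "0557", "0557", "0557", "0557"]
stalled = find_stalled_code(readings, stall_after=4)
if stalled is not None:
    print(stalled, KNOWN_CODES.get(stalled, "unknown status"))
```

A count of consecutive identical readings is used here instead of a wall-clock timer so that the sketch stays deterministic; a real monitor would presumably tie the threshold to elapsed time.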
2.  in real time and without user intervention, recover from a boot failure and an IPL failure of a POWER5 Server.

DETAILED DESCRIPTION

[0015] Reference will now be made in detail to the subject matter disclosed, which is illustrated in the accompanying drawings.

[0016] Method 100 provides the capability to dynamically, in real time and without user intervention, recover from a failed reboot or IPL of a POWER5 Server. The recovery action is based on a new option in the S-HMC allowing the user to select the type of action the Service Processor will take in the event of a boot failure or IPL failure 110, 120. The actions may be to: maintain an order of selection of a selected boot device and a plurality of other devices usable for an OS boot 121, deallocate the failed boot device 122, reduce the priority of the boot device in the bootlist 123, remove the boot device from the bootlist 124, or take no action, allowing the user to manually fix the boot or IPL problem. One action may be selected, and the action may be changed at any time.

[0017] When the SP detects a failure to boot or IPL from a boot device 130, the SP will check the action requested for a failure to boot the POWER5 server 110. The SP will then notify the user 160 and, once the SP has selected and executed the recovery action 110, the SP will initiate a reboot of the POWER5 server 140. If for some reason the reboot fails again, the SP will take the same action using the reduced pri
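The user-selectable actions listed in [0016] could be modeled roughly as below. This is a hedged Python sketch, not the Service Processor's actual interface: `RecoveryAction`, `BootConfig`, and `apply_action` are hypothetical names, and treating "reduce the priority" as moving the device to the end of the bootlist is an assumption, since the text does not say by how much the priority drops. The device names "primary" and "mirror" follow the two boot devices described in the background.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class RecoveryAction(Enum):
    """Actions from [0016]; values mirror the reference numerals in the text."""
    MAINTAIN_ORDER = 121
    DEALLOCATE = 122
    REDUCE_PRIORITY = 123
    REMOVE = 124
    NO_ACTION = 0  # The text gives no numeral for this option; 0 is a placeholder.


@dataclass
class BootConfig:
    bootlist: List[str]  # Highest priority first.
    deallocated: Set[str] = field(default_factory=set)


def apply_action(cfg: BootConfig, failed: str, action: RecoveryAction) -> None:
    """Apply the selected action for a failed boot device, in place."""
    if failed not in cfg.bootlist:
        raise ValueError(f"{failed!r} is not on the bootlist")
    if action is RecoveryAction.DEALLOCATE:
        cfg.deallocated.add(failed)
    elif action is RecoveryAction.REDUCE_PRIORITY:
        # Assumption: reduced priority means last position on the bootlist.
        cfg.bootlist.remove(failed)
        cfg.bootlist.append(failed)
    elif action is RecoveryAction.REMOVE:
        cfg.bootlist.remove(failed)
    # MAINTAIN_ORDER and NO_ACTION leave the configuration unchanged.


cfg = BootConfig(bootlist=["primary", "mirror"])
apply_action(cfg, "primary", RecoveryAction.REDUCE_PRIORITY)
print(cfg.bootlist)
```

Keeping the selected action as a single enum value matches the statement that one action may be selected and changed at any time: switching policy only means passing a different `RecoveryAction`.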
3.  of a boot success, or taking no action, allowing for manual user intervention.

100
110 receiving a user selected option of an action upon an event of a failure
120 complying with the user selected option without real time user intervention, further including:
121 maintaining an order of selection of a selected boot device and a plurality of reduced priority boot devices on the bootlist
130 detecting a failure
140 attempting a reboot of the server and an IPL with the selected boot device
150 detecting success or failure of the selected boot device
160 notifying a user of the success or failure of the selected boot device
170 updating the order of selection of boot devices on the bootlist
180 selecting a reduced priority boot device from the bootlist
190 attempting a reboot of the server and an IPL with the reduced priority boot device
200 continuing the reboot of the server and the IPL attempts using the reduced priority boot devices from the bootlist until detection of a boot success and an IPL success
210 detecting no further boot devices available on the bootlist to successfully complete the boot success and the IPL success
220 notifying a user of the failure

Patent Application Publication    Dec. 31, 2009    US 2009/0327813 A1

100
110 receiving a user selected option of an action upon an event of a failure
120 complying with t
4.  of selection of boot devices on the bootlist; selecting a reduced priority boot device from the bootlist; attempting a reboot of the server and an IPL with the reduced priority boot device; continuing the reboot of the server and the IPL attempts using the reduced priority boot devices from the bootlist until detection of a boot success and an IPL success; detecting no further boot devices available on the bootlist to successfully complete the boot success and the IPL success; and notifying a user of the failure; receiving a user selected option of an action upon an event of a boot device failure; complying with the user selected option without real time user intervention, further including: maintaining a bootlist for a server of a plurality of boot devices, further including: maintaining an order of selection of a selected boot device and a plurality of reduced priority boot devices on the bootlist; deallocating a failed boot device; changing a priority of a boot device on the bootlist; removing a boot device from the bootlist; detecting a boot device failure; attempting a reboot of the server with the selected boot device; detecting success or failure of the selected boot device; notifying a user of the success or failure of the selected boot device; updating the order of selection of boot devices on the bootlist; selecting a reduced priority boot device from the bootlist; attempting a reboot of the server with the reduced pr
5.  system (OS) is an open standards-based UNIX operating system that allows a user to run the desired applications on IBM UNIX OS-based servers. The boot devices and the Service Processor (SP) are the POWER5 hardware of special interest to this invention.

[0003] Each POWER5 server in the DS8000 contains two boot devices. One boot device may be designated the primary boot device, and the other a mirror boot device. The boot devices contain the boot files to load the AIX kernel, the AIX Operating System, and the DS8000 application program that performs the functions necessary to manage the storage controller. In addition, the boot devices contain configuration files that are used to manage the hardware resources of the storage controller.

[0004] The Service Processor (SP) is a POWERPC controller embedded in a POWER5 server. The SP contains application programs and device drivers that are required for the functionality of the service processor hardware. The SP application programs are used to manage and monitor the POWER5 server hardware resources and devices. The SP provides functions to manage the automatic power restart of a POWER5 server, to manage the selection of the boot devices, and the capability to modify the boot list. The auto-restart (reboot) option, when enabled, may reboot the system automatically following an unrecoverable hardware- or software-related failure.

[0005] The SP provides several distinct surveillance functi
6. US 20090327813A1

(19) United States
(12) Patent Application Publication    (10) Pub. No.: US 2009/0327813 A1
     Coronado et al.                   (43) Pub. Date: Dec. 31, 2009

(54) METHOD TO RECOVER FROM A BOOT DEVICE FAILURE DURING REBOOT OR SYSTEM IPL

(75) Inventors: Juan A. Coronado, Tucson, AZ (US); Aaron E. Taylor, Tucson, AZ (US); Christina A. Lara, Tucson, AZ (US); David W. Sharik, Tucson, AZ (US); Justin D. Suess, Tucson, AZ (US); Phu Nguyen, Tucson, AZ (US); Richard Cunningham, Tucson, AZ (US); Adote A. Tounou, Tucson, AZ (US)

Correspondence Address: IBM CORPORATION (ACCSP), c/o Suiter Swantz pc llo, 14301 FNB Parkway, Suite 220, Omaha, NE 68154 (US)

(73) Assignee: International Business Machines Corporation, Armonk, NY (US)

(21) Appl. No.: 12/146,087

(22) Filed: Jun. 25, 2008

Publication Classification

(51) Int. Cl.
     G06F 11/22 (2006.01)
     G06F 11/20 (2006.01)
(52) U.S. Cl.: 714/36; 714/E11.145; 714/E11.071

(57) ABSTRACT

A method of automatic recovery from a boot device failure and an initial program load (IPL) failure of an operating system (OS) comprises: receiving and complying with a user selected option of an action upon an event of a boot device failure and an IPL failure. The user selected option may consist of taking the action of attempting an auto reboot of the server with the selected boot device and continuing the reboot attempts using the reduced priority boot devices from the bootlist until detection
7. and include such changes.

1. A method of automatic recovery of an operating system (OS) from at least one of a boot device failure and an initial program load (IPL) failure, comprising:
receiving a user selected option of an action upon an event of a failure;
complying with the user selected option without real time user intervention, further including:
maintaining a bootlist for a server of a plurality of boot devices, further including:
maintaining an order of selection of a selected boot device and a plurality of reduced priority boot devices on the bootlist;
deallocating a failed boot device;
reducing a priority of a boot device on the bootlist;
removing a boot device from the bootlist;
detecting a failure;
attempting a reboot of the server and an IPL with the selected boot device;
detecting success or failure of the selected boot device;
notifying a user of the success or failure of the selected boot device;
updating the order of selection of boot devices on the bootlist;
selecting a reduced priority boot device from the bootlist;
attempting a reboot of the server and an IPL with the reduced priority boot device;
continuing the reboot of the server and the IPL attempts using the reduced priority boot devices from the bootlist until detection of a boot success and an IPL success;
detecting no further boot devices available on the bootlist to successfully complete the boot succe
8. for some reason a failure occurs in the running server. This also affects overall system performance, because the DS8000 performance has been fine-tuned with the availability of both POWER5 servers in mind. Just as important, this could also increase warranty costs, as it can take several hours to provide on-site support, including diagnosis of the problem. This in turn could add several more hours, as the hardware used as boot devices (HDD) is not normally carried by a customer engineer.

SUMMARY

[0011] A method of automatic recovery of an operating system (OS) from at least one of a boot device failure and an initial program load (IPL) failure including, but not limited to: receiving a user selected option of an action upon an event of a failure; complying with the user selected option without real time user intervention, further including: maintaining a bootlist for a server of a plurality of boot devices, further including: maintaining an order of selection of a selected boot device and a plurality of reduced priority boot devices on the bootlist; deallocating a failed boot device; reducing a priority of a boot device on the bootlist; removing a boot device from the bootlist; detecting a failure; attempting a reboot of the server and an IPL with the selected boot device; detecting success or failure of the selected boot device; notifying a user of the success or failure of the selected boot device; updating the order
9. he user selected option without real time user intervention, further including:
121 maintaining an order of selection of a selected boot device and a plurality of reduced priority boot devices on the bootlist
130 detecting a failure
140 attempting a reboot of the server and an IPL with the selected boot device
190 attempting a reboot of the server and an IPL with the reduced priority boot device
200 continuing the reboot of the server and the IPL attempts using the reduced priority boot devices from the bootlist until detection of a boot success and an IPL success
210 detecting no further boot devices available on the bootlist to successfully complete the boot success and the IPL success
220 notifying a user of the failure

FIG. 1

METHOD TO RECOVER FROM A BOOT DEVICE FAILURE DURING REBOOT OR SYSTEM IPL

TECHNICAL FIELD

[0001] The present disclosure generally relates to the field of computer server recovery, and more particularly to a method that provides the capability for a computer server, such as the POWER5 Server, to dynamically recover in real time without user intervention from a failed reboot.

BACKGROUND

[0002] The IBM DS8000 storage controller is a dual processor complex controller. The dual processor complex controller includes two POWER5-based servers. A POWER5 server contains processors and memory needed to run the applications that manage the DS8000 functions. The AIX operating
10. iority boot device; continuing the reboot attempts using the reduced priority boot devices from the bootlist until detection of a boot success; detecting no further boot devices available on the bootlist to successfully complete the boot success; and notifying a user of the boot device failure. The method embodies a user selected option comprising: taking an action of attempting a reboot of a server with a selected boot device and continuing reboot attempts using reduced priority boot devices from a plurality of boot devices on a bootlist until detection of a boot success; or taking no action, allowing for manual user intervention. The method embodies an additional aspect where the failure detected is an initial program load of an operating system.

[0012] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not necessarily restrictive of the present disclosure. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate subject matter of the disclosure. Together, the descriptions and the drawings serve to explain the principles of the disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] The numerous advantages of the disclosure may be better understood by those skilled in the art by reference to the accompanying figures in which:

[0014] FIG. 1 is a flow diagram illustrating a method to dynamically
11. ons. One SP surveillance function monitors the functionality of the firmware during the boot process, while another surveillance function monitors the health of the Operating System. The surveillance functions allow the SP to take appropriate action when the monitor function detects a failure. The SP can deallocate or deconfigure a hardware resource. The hardware resources deallocated or deconfigured are bypassed during the boot process.

[0006] In general, the configuration and management of each POWER5 server and the DS8000 hardware resources takes place through a Storage Hardware Management (S-HMC) console. The S-HMC is a stand-alone workstation that provides both a Graphical User Interface (GUI) and a Command Line Interface (CLI). In reality, the S-HMC interfaces with the SP to provide a user the capability to perform configuration, resource management, and maintenance activities on a POWER5 server and additional hardware resources of the DS8000 storage controller. The S-HMC console provides, among several functions, the capability to remotely control the power management of each POWER5 server, such as the ability to manage the power of a POWER5 server.

[0007] The S-HMC, the POWER5 server, and the Service Processor are interconnected through two 16-port Ethernet switches. Each POWER5 server, Service Processor, and the S-HMC connects to each switch. This configuration provides for a fully redundant management network.
12. ority boot device on the bootlist 180. The SP will continue to take actions 200 until the SP detects that it is impossible to boot the system because there are no longer boot devices available to successfully complete a boot or IPL of the OS and its application 210. In the event that no devices are available to boot the system, Method 100 will notify the user of the failure 220.

[0018] In the present disclosure, the methods disclosed may be implemented as sets of instructions or software readable by a device. Further, it is understood that the specific order or hierarchy of steps in the methods disclosed are examples of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the method can be rearranged while remaining within the disclosed subject matter. The accompanying method claims present elements of the various steps in a sample order, and are not necessarily meant to be limited to the specific order or hierarchy presented.

[0019] It is believed that the present disclosure and many of its attendant advantages will be understood by the foregoing description, and it will be apparent that various changes may be made in the form, construction and arrangement of the components without departing from the disclosed subject matter or without sacrificing all of its material advantages. The form described is merely explanatory, and it is the intention of the following claims to encompass
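The retry behavior described in [0017], where the SP keeps attempting a boot from lower priority devices until one succeeds or none remain, might look like the following loop. This is a minimal Python sketch under assumptions: `recover_boot`, `try_boot`, and `notify` are hypothetical names standing in for the SP's reboot and notification mechanisms, which the text does not specify, and step 170 (updating the order of selection on the bootlist) is deliberately left out.

```python
from typing import Callable, List, Optional


def recover_boot(
    bootlist: List[str],
    try_boot: Callable[[str], bool],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Try each device in bootlist order; return the one that booted, or None."""
    for device in bootlist:  # 180: select the next device by priority
        booted = try_boot(device)  # 140/190: attempt the reboot and IPL
        notify(f"{device}: {'success' if booted else 'failure'}")  # 150/160
        if booted:
            return device
    # 210/220: no device left on the bootlist, so report the overall failure.
    notify("no bootable device left on the bootlist")
    return None


# Hypothetical scenario: the primary device fails and the mirror device boots.
messages: List[str] = []
result = recover_boot(
    ["primary", "mirror"],
    try_boot=lambda device: device == "mirror",
    notify=messages.append,
)
print(result, messages)
```

Passing `try_boot` and `notify` as callables keeps the loop testable without hardware; the numerals in the comments refer to the flow diagram steps of FIG. 1.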
13. ss and the IPL success; and
notifying a user of the failure.

* * * * *
    