Hardware Diagnostics for Sun™ Systems: A Toolkit for
System Administrators
Have you ever stared at the ok
prompt on your Sun system, and wondered how to continue? Or have
you ever wondered why all the LEDs on the system board sometimes
appear to flash madly like a broken street light?
Look no further -- read on to find the answer
to these questions.
By using OpenBoot™
commands, the Power On Self Test (POST) program, and the status
LEDs on system boards, you can diagnose hardware-related problems
on Sun Microsystems™ server and desktop
products. With these low-level diagnostics, you can establish the
state of the system and attached devices. For example, you can determine
whether a device is recognized by the system and working properly, and
you can obtain useful system configuration information.
Use this table to locate subjects in this
article:
OpenBoot PROM (OBP) Diagnostic Commands and Tools | Describes OBP commands you can use to display the system configuration, test devices attached to the system, monitor network connections, and more. |
OBDiag | Shows how you can run tests and perform diagnostics on the main logic board and its interfaces, and on devices such as disk and tape drives. |
Power On Self Test (POST) | Explains how POST initializes, configures, and tests the system, and shows you how to capture POST output and interpret the results using the LEDs on the system board and power supply. |
System Board and Power Supply LED Status Tables | Provides reference information to help you interpret the meaning of LED status for system boards and power supplies installed on Ultra™ Enterprise Server products. |
Solaris Operating Environment Diagnostic Commands | Lists useful OS commands you can use to display the system configuration, including failed Field Replaceable Units (FRUs), hardware revision information, installed patches, and more. |
OBP DIAGNOSTIC COMMANDS
AND TOOLS
OBP is a powerful, low-level interface to
the system and devices attached to the system (OBP is also known
as the ok prompt). By entering simple OBP commands, you
can learn system configuration details such as the ethernet address,
the CPU and bus speeds, installed memory, and so on. Using OBP,
you can also query and set system parameter values such as the default
boot device, run tests on devices such as the network interface,
and display the SCSI and SBUS devices attached to the system.
The following table describes commands available
in OpenBoot version 3.x. To use a command, type the command
at the OBP ok prompt and press Return.
banner | Displays the power-on banner. The banner includes information such as CPU speed, OBP revision, total system memory, ethernet address, and hostid. |
devalias alias path | Defines a new device alias, where alias is the new alias name and path is the physical path of the device. If devalias is used without arguments, it displays all system device aliases. |
.enet-addr | Displays the ethernet address. |
led-off/led-on | Turns the system LED off or on. |
nvalias name path | Creates a new alias for a device, where name is the name of the alias and path is the physical path of the device. Note - Run the reset-all or the nvstore command to save the new alias in non-volatile memory (NVRAM). |
nvunalias name | Deletes a user-created alias (see nvalias), where name is the name of the alias. Note - Run the reset-all or nvstore command to save changes in NVRAM. |
nvstore | Copies the contents of the temporary buffer to NVRAM and discards the contents of the temporary buffer. |
power-off/power-on | Powers the system off or on. |
printenv | Displays all parameters, settings, and values. |
probe-fcal-all | Identifies Fibre Channel Arbitrated Loop (FC-AL) devices on a system. 1 |
probe-sbus | Identifies devices attached to all SBUS slots. Note - This command works only on systems with SBUS slots. |
probe-scsi | Identifies devices attached to the onboard SCSI bus. 1 |
probe-scsi-all | Identifies devices attached to all SCSI buses. 1 |
set-default parameter | Resets the value of parameter to the default setting. |
set-defaults | Resets the value of all parameters to the default settings. Tip - You can also press the Stop and N keys simultaneously during system power-up to reset the values to their defaults. |
setenv parameter value | Sets parameter to the specified value. Note - Run the reset-all command to save changes in NVRAM. |
show-devs | Displays all the devices recognized by the system. |
show-disks | Displays the physical device path for disk controllers. |
show-displays | Displays the physical device path for frame buffers. |
show-nets | Displays the physical device path for network interfaces. |
show-post-results | If run after Power On Self Test (POST) is completed, this command displays the findings of POST in a readable format. |
show-sbus | Displays devices attached to all SBUS slots. Similar to probe-sbus. |
show-tapes | Displays the physical device path for tape controllers. |
sifting string | Searches for OBP commands or methods that contain string. For example, the sifting probe command displays probe-scsi, probe-scsi-all, probe-sbus, and so on. |
.speed | Displays CPU and bus speeds. |
test device-specifier | Executes the selftest method for device-specifier. For example, the test net command tests the network connection. |
test-all | Tests all devices that have a built-in test method. |
.version | Displays OBP and POST version information. |
watch-clock | Tests a clock function. |
watch-net | Monitors the network connection for the primary interface. |
watch-net-all | Monitors all the network connections. |
words | Displays all OBP commands and methods. |
1 On Ultra (sun4u) systems, set
the auto-boot? variable to false before you run the probe-scsi,
probe-scsi-all, or probe-fcal-all command;
otherwise the command will cause the system to hang. To set this variable, type
setenv auto-boot? false at the ok prompt,
then type reset-all. Remember to change
the value back to true when testing is completed,
or the system will not boot automatically.
OBDIAG
OBDiag enables you to interactively run
tests and diagnostics at the OBP level on these Sun systems:
- Sun Enterprise 420R Server
- Sun Enterprise 220R Server
- Sun Ultra Enterprise 450 Server
- Sun Ultra Enterprise 250 Server
- Sun Ultra 80
- Sun Ultra 60
- Sun Ultra 30
- Sun Ultra 10
- Sun Ultra 5
OBDiag displays its test results using the
LEDs on the front system panel and on the keyboard. Use the system board and power supply LED status tables
to interpret the results.
OBDiag also displays diagnostic and error
messages on the system console. To learn more about OBDiag, visit
docs.sun.com.
OBDiag tests not only the main logic board
but also its interfaces:
- PCI
- SCSI
- Ethernet
- Serial
- Parallel
- Keyboard/mouse
- NVRAM
- Audio
- Video
How To Run OBDiag
To run OBDiag, type obdiag
at the OpenBoot ok prompt.
You can also set up OBDiag to run automatically
when the system is powered on using the following methods:
- Set the OBP diagnostics variable:
ok setenv diag-switch? true
- Press the Stop and D keys simultaneously
while you power on the system.
- On Ultra Enterprise servers, turn the key switch
to the diagnostics position and power on the system.
POWER ON SELF TEST (POST)
POST is a program that resides in the firmware
of each board in a system. It initializes, configures,
and tests the system boards. POST output is sent to serial port A
(on an Ultra Enterprise server, POST output is sent only to serial
port A on the system and clock board). The status LEDs of
each system board on Ultra Enterprise servers indicate the POST
completion status. For example, if a system board fails POST,
the amber LED stays lit.
You can watch POST output in real time by
attaching a terminal device to serial port A. If none is available,
you can use the OBP command show-post-results to view the results
after POST completes.
How To Run POST
- Attach a terminal device to serial
port A.
- Set the OBP diagnostics variable:
ok setenv diag-switch? true
- Set the desired testing level.
POST can run at two levels, covering either all of the tests (max)
or a subset of them (min). Set
the OBP variable diag-level to the desired level of testing,
for example:
ok setenv diag-level max
- If you wish to boot from disk, set the OBP variable
diag-device :
ok setenv diag-device disk
The system default for this variable is
net.
- Set the auto-boot? variable:
ok setenv auto-boot? false
- Save the changes:
ok reset-all
- Power cycle the system (turn it off, and
then back on).
POST runs as the system powers
on, and the output is displayed on the device attached to serial
port A. After POST is completed, you can also run the OBP command
show-post-results to view the results.
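The steps above can be collected into a short sketch that prints the OBP commands in the order they would be typed. This is only an illustration: the function name post_setup_commands and its parameters are hypothetical, and the output is meant to be read or pasted at the ok prompt rather than executed on the host.

```python
# Hypothetical helper: assembles the OBP commands from the POST steps above.
VALID_LEVELS = ("max", "min")


def post_setup_commands(level="max", diag_device=None):
    """Return, in order, the OBP commands that prepare a system to run POST."""
    if level not in VALID_LEVELS:
        raise ValueError(f"diag-level must be one of {VALID_LEVELS}, got {level!r}")
    commands = ["setenv diag-switch? true", f"setenv diag-level {level}"]
    if diag_device is not None:
        # Optional step: the article gives net as the default for diag-device.
        commands.append(f"setenv diag-device {diag_device}")
    commands.append("setenv auto-boot? false")
    commands.append("reset-all")  # save the changes, then power cycle the system
    return commands


if __name__ == "__main__":
    for cmd in post_setup_commands(level="max", diag_device="disk"):
        print("ok", cmd)
```

Remember to set auto-boot? back to true when testing is completed, as noted earlier for the probe commands.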
SYSTEM BOARD AND POWER SUPPLY LED
STATUS TABLES
This section contains reference information
to help you understand the LED status on system boards and power
supplies installed on Ultra Enterprise Server products.
Ultra Enterprise Server Front Panel and Clock
Board LED Status
Power LED | Service LED | Cycling LED | Condition |
off | off | off | no power |
off | on | off | failure mode |
off | off | on | failure mode |
off | on | on | failure mode |
on | off | off | hung in POST/OBP or OS |
on | off | on | hung in OS |
on | on | off | hung in POST/OBP; hung in OS/failed component |
on | on | on | hung in POST/OBP; hung in OS/failed component |
on | off | flashing | OS running normally |
on | on | flashing | OS running with failed component |
on | flashing | off | slow flash = POST; fast flash = OBP |
on | flashing | on | OS or OBP error |
Notes:
LED Name | Location | Note |
Power LED | Left | Should always be on. If all three LEDs are off, suspect a power problem. If this LED is in any state other than on and steady, it indicates a problem. |
Service LED | Middle | This LED should be off in normal operation. If on, a component is in an error state and you should check the individual board LEDs. A lit service LED does not imply there is an OS-related problem. |
Cycling LED | Right | This LED should be flashing; that is the normal state. |
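As a quick reference, the front panel and clock board table can be expressed as a lookup keyed on the three LED states. The sketch below only transcribes the table; the names FRONT_PANEL_LED_STATUS and describe_front_panel are hypothetical, and the LED states must be read by eye and entered by hand.

```python
# Front panel / clock board LED states, transcribed from the table above.
# Keys are (power, service, cycling); values are the listed conditions.
FRONT_PANEL_LED_STATUS = {
    ("off", "off", "off"): "no power",
    ("off", "on", "off"): "failure mode",
    ("off", "off", "on"): "failure mode",
    ("off", "on", "on"): "failure mode",
    ("on", "off", "off"): "hung in POST/OBP or OS",
    ("on", "off", "on"): "hung in OS",
    ("on", "on", "off"): "hung in POST/OBP; hung in OS/failed component",
    ("on", "on", "on"): "hung in POST/OBP; hung in OS/failed component",
    ("on", "off", "flashing"): "OS running normally",
    ("on", "on", "flashing"): "OS running with failed component",
    ("on", "flashing", "off"): "slow flash = POST; fast flash = OBP",
    ("on", "flashing", "on"): "OS or OBP error",
}


def describe_front_panel(power, service, cycling):
    """Look up the condition for one observed LED combination."""
    key = (power.lower(), service.lower(), cycling.lower())
    return FRONT_PANEL_LED_STATUS.get(key, "combination not listed in the table")


if __name__ == "__main__":
    print(describe_front_panel("on", "off", "flashing"))
```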
Ultra Enterprise CPU/Memory, I/O,
and Disk Board LED Status
Power LED | Service LED | Cycling LED | Condition |
off | off | off | board no power |
off | on | off | low power mode - unpluggable |
off | off | on | failure mode |
off | on | on | failure mode |
on | off | off | hung in POST/OBP or OS |
on | off | on | hung in OS |
on | on | off | hung in POST/OBP; hung in OS and failed component on board |
on | on | on | hung in POST/OBP; hung in OS/failed component on board |
on | off | flashing | OS running normally |
on | on | flashing | OS running normally/failed component on board |
on | flashing | off | slow flash = POST; fast flash = OBP |
on | flashing | on | OS or OBP error |
Notes:
Low Power Mode - If the
status of the LEDs on the board is off-on-off,
the board is in low power mode. This occurs when
the board is disabled because it failed POST, or when
the board was just inserted. Low power mode is the only
state in which you may unplug the board while the system
is running.
Disk Boards - The amber LED on
disk boards installed in Ultra Enterprise servers
remains on when the server is running
Solaris 2.6 5/98 or later. This is normal, and it
indicates the board is in low power mode (the board can be
removed from the system provided the disks have been
idled).
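The Low Power Mode note reduces to a single check: a board may be pulled from a running system only when its LEDs read off-on-off. A minimal sketch of that rule follows, with a hypothetical function name. It does not cover the extra condition for disk boards, whose disks must be idled first.

```python
# LED pattern (power, service, cycling) that the notes call low power mode.
LOW_POWER_MODE = ("off", "on", "off")


def board_may_be_unplugged(power, service, cycling):
    """Return True only when the board LEDs read off-on-off (low power mode)."""
    return (power.lower(), service.lower(), cycling.lower()) == LOW_POWER_MODE


if __name__ == "__main__":
    print(board_may_be_unplugged("off", "on", "off"))
    print(board_may_be_unplugged("on", "off", "flashing"))
```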
Power Supply LED Status
LEDs are used on the power supply to
report an error condition such as power supply or fan failure.
Power supplies are hot-pluggable, but the Solaris Operating Environment
halts the system if insufficient power is detected. Generally,
a system is configured with a power supply for each system board.
Green LED | Yellow LED | Condition |
off | off | No AC input or keyswitch is turned off |
on | off | normal operation |
on | on | Fan failure or one or more voltages out of specification |
off | on | One or more DC outputs failed, or voltages out of specification, or system in low power state |
SOLARIS OPERATING ENVIRONMENT DIAGNOSTIC COMMANDS
The following table describes OS commands
you can use to display the system configuration, such as failed
Field Replaceable Units (FRU), hardware revision information,
installed patches, and so on.
/usr/platform/sun4u/sbin/prtdiag -v | Displays system configuration and diagnostic information, and lists any failed Field Replaceable Units (FRU). |
/usr/bin/showrev [-p] | Displays revision information for the current hardware and software. When used with the -p option, displays installed patches. |
/usr/sbin/prtconf | Displays system configuration information. |
/usr/sbin/psrinfo -v | Displays CPU information, including clock speed. |
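When you collect information for a service call, it can help to capture the output of all four commands in one pass. The sketch below is one way to do that, assuming a Python 3.7 or later interpreter is available on the host (the article does not say so). The names DIAGNOSTIC_COMMANDS and collect_diagnostics are hypothetical; the command paths are the ones listed in the table.

```python
import os
import subprocess

# Command lines taken from the table above.
DIAGNOSTIC_COMMANDS = [
    ["/usr/platform/sun4u/sbin/prtdiag", "-v"],
    ["/usr/bin/showrev", "-p"],
    ["/usr/sbin/prtconf"],
    ["/usr/sbin/psrinfo", "-v"],
]


def collect_diagnostics(commands=DIAGNOSTIC_COMMANDS):
    """Run each command and return a dict mapping its command line to its output."""
    report = {}
    for argv in commands:
        label = " ".join(argv)
        if not os.path.exists(argv[0]):
            # Skip commands that are not installed at the listed path.
            report[label] = "(not found on this system)"
            continue
        result = subprocess.run(argv, capture_output=True, text=True)
        # Keep stderr as well, since error text is part of the diagnosis.
        report[label] = result.stdout + result.stderr
    return report


if __name__ == "__main__":
    for label, output in collect_diagnostics().items():
        print(f"===== {label} =====")
        print(output)
```

Redirecting the script's output to a file gives a single snapshot of the configuration that can be compared against a later run.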
RELATED LINKS
Using Device Path Names to Identify System Devices: Eliminate the Guesswork
Establish the hardware configuration
of your system using the OpenBoot™
device tree.