Sun Microsystems, Inc.
4150 Network Circle
Santa Clara, CA 95054 U.S.A.
Part No. 816-5075-12
January 2003, Revision A
Send comments about this document to: docfeedback@sun.com
Copyright 2003 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved.
Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in the product that is described in this document. In
particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at
http://www.sun.com/patents, and one or more additional patents or pending patent applications in the U.S. and in other countries.
This document and the product to which it pertains are distributed under licenses restricting their use, copying, distribution, and
decompilation. No part of the product or of this document may be reproduced in any form by any means without prior written authorization of
Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in
the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, and Solaris are trademarks, registered trademarks, or service marks of Sun
Microsystems, Inc. in the U.S. and other countries.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other
countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges
the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun
holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN
LOOK GUIs and otherwise comply with Sun’s written license agreements.
Use, duplication, or disclosure by the U.S. Government is subject to restrictions set forth in the Sun Microsystems, Inc. license agreements and as
provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (Oct. 1998), FAR 12.212(a) (1995), FAR 52.227-19, or
FAR 52.227-14 (ALT III), as applicable.
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT,
ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Contents
Preface
Before You Read This Book
How This Book Is Organized
Using UNIX Commands
Typographic Conventions
Shell Prompts
Related Documentation
Accessing Sun Documentation Online
Sun Welcomes Your Comments
1. Introduction to DR on the Sun Fire 15K/12K Server
What Is DR?
Where You Execute DR Commands
Command Line Interface (CLI)
Graphical User Interface (GUI)
Automatic DR
Enhanced System Availability
DR Concepts
Detachability
Quiescence
Suspend-Safe and Suspend-Unsafe Devices
Attachment Points
Conditions and States
DR Operations
Hot-Plug Hardware
Sun Fire 15K/12K Domains
Component Types
DR on I/O Boards
Solving a Problem With an I/O Device
Golden IOSRAM
DR on hsPCI+ I/O Boards
Permanent and Non-permanent Memory
Target Memory Constraints
Correctable Memory Errors
Capacity on Demand (COD)
DR on COD Boards
Enabling DR on Domains Running the Solaris 8 2/02 Operating Environment
An Illustration of DR Concepts
2. DR State and Condition Models
Board States and Conditions
Board Slot States
Board Occupant States
Board Conditions
Component States and Conditions
Component Receptacle States
Component Occupant States
Component Conditions
Sun Fire 15K/12K Dynamic Reconfiguration (DR) User Guide • January 2003
3. DR Operations and Software Components on the Domain
DR Operations
Before You Perform DR Operations
Before Performing DR Operations on I/O Boards
Connect Operation
Configure Operation
CPUs and Memory
I/O Boards
After the Configure Operation
Disconnect Operation
Unconfigure Operation
Non-permanent Memory
Permanent Memory
Software Components
Domain Configuration Server
DR Driver
Reconfiguration Coordination Manager
System Events Framework
4. DR User Interfaces on the Domain
DR Commands and Options on the Domain
State Change Functions
Availability Change Functions
Condition Change Functions
Options and Operands
5. DR Domain Procedures
Attachment Points
Displaying Board Status
Basic Status Display
Detailed Status Display
Removing a Board
▼ To Remove a CPU/Memory Board
▼ To Remove an I/O Board
Adding a Board
▼ To Install a Board
DR Using cfgadm(1M) - Examples
Displaying Help
Displaying Verbose Messages
Suppressing User Confirmation
Power Control When Disconnecting Boards
Power Control of Disconnected Boards
Connecting and Configuring Boards
Hot Plugging PCI Adapter Cards
Testing a Board
Displaying Attachment Point Information
Tracking Memory Unconfigure Operations
Finding the Board Containing Permanent Memory
Preface
This book describes the Dynamic Reconfiguration (DR) feature of the Sun™ Fire 15K
and Sun Fire 12K systems. DR enables you to attach system boards to and detach
them from Sun Fire 15K/12K domains while the Solaris operating environment
continues to run.
Before You Read This Book
This book is intended for the Sun Fire 15K/12K system administrator who has a
working knowledge of UNIX® systems, particularly those based on the Solaris™
operating environment. If you do not have such knowledge, first read the Solaris
user and system administrator books provided with this system and consider UNIX
system administration training.
How This Book Is Organized
This book contains the following chapters:
Chapter 1 “Introduction to DR on the Sun Fire 15K/12K Server”
Chapter 2 “DR State and Condition Models”
Chapter 3 “DR Operations and Software Components on the Domain”
Chapter 4 “DR User Interfaces on the Domain”
Chapter 5 “DR Domain Procedures”
Using UNIX Commands
This document may not contain information on basic UNIX® commands and
procedures such as shutting down the system, booting the system, and configuring
devices.
See one or more of the following for this information:
■ Online documentation for the Solaris™ software environment
■ Other software documentation that you received with your system
Typographic Conventions
Typeface or Symbol: AaBbCc123 (monospace)
Meaning: The names of commands, files, and directories; on-screen computer output
Examples: Edit your .login file.
Use ls -a to list all files.
% You have mail.

Typeface or Symbol: AaBbCc123 (bold monospace)
Meaning: What you type, when contrasted with on-screen computer output
Examples: % su
Password:

Typeface or Symbol: AaBbCc123 (italic)
Meaning: Book titles, new words or terms, words to be emphasized. Command-line variable; replace with a real name or value
Examples: Read Chapter 6 in the User’s Guide.
These are called class options.
You must be superuser to do this.
To delete a file, type rm filename.
Shell Prompts
Shell                                    Prompt
C shell                                  machine_name%
C shell superuser                        machine_name#
Bourne shell and Korn shell              $
Bourne shell and Korn shell superuser    #
Related Documentation
DR Webpage: http://www.sun.com/servers/highend/dr_sunfire
System Management Services (SMS) 1.3 Dynamic Reconfiguration User Guide
System Management Services (SMS) 1.3 Administrator Guide
Solaris 9 4/03 Release Notes Supplement for Sun Hardware
Release Notes
Part numbers: 816-7723, 816-5319, 817-1106, 816-5321, n/a
Accessing Sun Documentation Online
You can view and print a broad selection of Sun(TM) documentation, including
localized versions, at:
http://www.sun.com/documentation
You can also purchase printed copies of select Sun documentation from iUniverse,
the Sun documentation provider, at:
http://corppub.iuniverse.com/marketplace/sun
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and
suggestions. You can email your comments to Sun at:
docfeedback@sun.com
Please include the part number of this document (816-5075-12) in the subject line of
your email.
CHAPTER 1
Introduction to DR on the Sun Fire 15K/12K Server
This chapter describes general concepts that pertain to the
Dynamic Reconfiguration (DR) feature on the Sun Fire 15K and Sun Fire 12K servers.
What Is DR?
DR on the Sun Fire 15K/12K server enables you to perform hardware configuration
changes to a live domain that is running the Solaris operating environment, without
causing machine downtime. You can also use DR in conjunction with hot-swap to
physically add boards to or remove them from the server.
Where You Execute DR Commands
You can execute DR operations from the Sun Fire 15K/12K system controller (SC) by
using the system management services (SMS) commands: addboard(1M),
moveboard(1M), deleteboard(1M), and rcfgadm(1M); or from the domain by
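using the cfgadm(1M) command. DR operations that you run from the domain with
cfgadm(1M) are described in Chapter 5, “DR Domain Procedures”; for DR operations
that use SMS commands, refer to the System Management Services (SMS) 1.3 Dynamic
Reconfiguration User Guide.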
Note – If the addboard(1M), moveboard(1M), deleteboard(1M), rcfgadm(1M),
or cfgadm(1M) command fails during a DR operation, the board does not return to
its original state. A dxs or dca error message is logged to the domain. If the error is
recoverable, you can retry the command. If the error is unrecoverable, you must
reboot the domain to use the board.
Command Line Interface (CLI)
The DR software has a command line interface through the cfgadm(1M) command,
which is the configuration administration program. The DR agent also provides a
remote interface to the Sun Management Center 3.0 software.
Graphical User Interface (GUI)
The optional Sun Management Center 3.0 Platform Update 4 software, which is
designed for these systems, provides features such as domain management, as well
as a graphical user interface (GUI) where you perform DR operations. If you prefer
to use a graphical user interface instead of a command line interface, use the Sun
Management Center 3.0 software.
To use the Sun Management Center 3.0 Platform Update 4 software, you must attach
the system controller board to a network. With a network connection, you can view
both the command line interface and the graphical user interface. For instructions on
how to use the Sun Management Center 3.0 Platform Update 4 software, refer to the
Sun Management Center 3.0 User’s Guide, shipped with the Sun Management Center
3.0 Platform Update 4 software. For instructions on how to connect the system
controller board to a network, see your system’s installation documentation.
Automatic DR
Automatic DR enables an application to execute DR operations without requiring
user interaction. This ability is provided by an enhanced DR framework that
includes the reconfiguration coordination manager (RCM) and the system event
facility, called sysevent. The RCM enables application-specific loadable modules to
register callbacks. The callbacks perform preparatory tasks before a DR operation,
error recovery during a DR operation, or clean-up after a DR operation. The
sysevent facility enables applications to register for system events and receive
notifications of those events. The automatic DR framework interfaces with the RCM
and with the sysevent facility to enable applications to automatically give up
resources prior to unconfiguring them and to capture new resources as they are
configured into the domain.
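The callback pattern described above can be illustrated with a small sketch. This is not the RCM interface; the class name, the phase names ("before", "on_error", "after"), and the resource label are hypothetical and chosen only to mirror the three kinds of callback work mentioned in this section.

```python
from collections import defaultdict

# Hypothetical phase names; the real RCM framework defines its own interfaces.
PHASES = ("before", "on_error", "after")


class CallbackRegistry:
    """Toy stand-in for a coordination manager that modules register with."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, phase, callback):
        """Register a callable that takes the resource name as its argument."""
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self._callbacks[phase].append(callback)

    def run_operation(self, resource, operation):
        """Run operation(resource) wrapped in the registered callbacks."""
        # Preparatory tasks, for example releasing the resource.
        for callback in self._callbacks["before"]:
            callback(resource)
        try:
            operation(resource)
        except Exception:
            # Error recovery, then let the caller see the failure.
            for callback in self._callbacks["on_error"]:
                callback(resource)
            raise
        # Clean-up after a successful operation.
        for callback in self._callbacks["after"]:
            callback(resource)
```

An application module would register its callbacks once, and the framework would then invoke them around each DR operation without user interaction.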
Enhanced System Availability
The DR feature enables you to hot-swap system boards without bringing the server
down. It is used to unconfigure the resources on a faulty system board from a
domain so that the system board can be removed from the server. The repaired or
replacement board can be inserted into the domain while the Solaris operating
environment continues to run. DR then configures the resources on the board into
the domain. If you use the DR feature to add or remove a system board or
component, DR always leaves the board or component in a known configuration
state. See Chapter 2, “DR State and Condition Models,” for more information about
configuration states for system boards and components.
DR Concepts
This section contains descriptions of general DR concepts that pertain to Sun Fire
15K/12K domains. For more information about DR concepts on the SC, refer to the
System Management Services (SMS) 1.3 Dynamic Reconfiguration User Guide.
Detachability
For a device to be detachable, it must conform to the following items:
■ The device driver must support DDI_DETACH.
■ Critical resources must be redundant or accessible through an alternate pathway.
CPUs and memory banks can be redundant critical resources. Disk drives are
examples of critical resources that can be accessible through an alternate pathway.
Some boards cannot be detached because their resources cannot be moved. For
example, if a domain has only one CPU board, that CPU board cannot be detached.
An I/O board is not detachable if it controls the boot drive.
If there is no alternate pathway for an I/O board, you can:
■ Put the disk chain on a separate I/O board. The secondary I/O board can then be
detached.
■ Add a second path to the device through a second I/O board so that the I/O
board can be detached without losing access to the secondary disk chain.
Note – If you are unsure whether a device is detachable, consult your Sun service
representative.
Quiescence
During the unconfigure operation on a system board with permanent memory
(OpenBoot™ PROM or kernel memory), the operating environment is briefly
paused, which is known as operating environment quiescence. All operating
environment and device activity on the domain must cease during this critical phase
of the operation.
Before it can achieve quiescence, the operating environment must temporarily
suspend all processes, CPUs, and device activities. If the operating environment
cannot achieve quiescence, it displays the reasons, which may include the following:
■ An execution thread did not suspend.
■ A device exists that cannot be paused by the operating environment.
Note – Real-time processes do not prevent quiescence.
The conditions that cause processes to fail to suspend are generally temporary.
Examine the reasons for any failure, and if the operating environment encountered a
failure to suspend a process, simply try the operation again.
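The retry advice above can be sketched as a small loop that repeats the operation only when the reported reason is a temporary one. This is an illustration under stated assumptions: the reason labels and the `operation` callable are hypothetical and do not correspond to actual DR messages or commands.

```python
import time

# Hypothetical reason labels; real DR messages are worded differently.
TRANSIENT_REASONS = {"thread_not_suspended"}


def attempt_with_retry(operation, attempts=3, delay_seconds=0.0):
    """Call operation() until it succeeds, fails permanently, or attempts run out.

    operation() must return None on success or a reason string on failure.
    The function returns None on success, otherwise the last failure reason.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    reason = None
    for attempt in range(attempts):
        reason = operation()
        if reason is None:
            return None
        if reason not in TRANSIENT_REASONS:
            # For example, a device that cannot be paused: retrying will not help.
            break
        if attempt + 1 < attempts:
            time.sleep(delay_seconds)
    return reason
```

A failure caused by a device that cannot be paused is returned immediately, because repeating the operation does not change that condition.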
Suspend-Safe and Suspend-Unsafe Devices
When DR suspends the operating environment, all of the device drivers that are
attached to the operating environment must also be suspended. If a driver cannot be
suspended (or subsequently resumed), the DR operation fails.
A suspend-safe device does not access memory or interrupt the system while the
operating environment is in quiescence. A driver is suspend-safe if it supports
operating environment quiescence (if it can be suspended and then resumed). A
suspend-safe driver also guarantees that when a suspend request is successfully
completed, the device that the driver manages will not attempt to access memory,
even if the device is open when the suspend request is made.
A suspend-unsafe device allows a memory access or a system interruption to occur
while the operating environment is in quiescence.
DR uses an unsafe driver list in the dr.conf file to prevent unsafe devices from
accessing memory or interrupting the operating environment during a DR
operation. The dr.conf file resides in the following directory:
/platform/SUNW,Sun-Fire-15000/kernel/drv/. The unsafe driver list is a
property in the dr.conf file with the following format:
DR reads this list when it prepares to suspend the operating environment so that it
can unconfigure a memory component. If DR finds an active driver in the unsafe
driver list, it aborts the DR operation and returns an error message. The message
includes the identity of the active, unsafe driver. You must manually stop the
use of the device by performing one or more of the following tasks:
■ Kill the processes using the device.
■ Unload the driver by using the modunload(1M) command.
■ Disconnect the cables (depending on the type of device).
You can retry the DR operation after you have stopped usage of the device.
Note – If you are unsure whether a device is suspend-safe, contact your Sun service
representative.
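The unsafe-driver check described in this section amounts to comparing two sets of driver names. The following sketch assumes the unsafe list and the list of active drivers are already available as Python collections; it does not parse dr.conf, and the function names are hypothetical.

```python
def find_blocking_drivers(unsafe_drivers, active_drivers):
    """Return the active drivers that also appear in the unsafe list, sorted."""
    return sorted(set(unsafe_drivers) & set(active_drivers))


def can_suspend(unsafe_drivers, active_drivers):
    """Return (ok, blockers); ok is False when any unsafe driver is active."""
    blockers = find_blocking_drivers(unsafe_drivers, active_drivers)
    return (not blockers, blockers)
```

The list of blockers plays the role of the driver identity reported in the error message: once those drivers are no longer active, the same check succeeds and the operation can be retried.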
Attachment Points
An attachment point is a collective term that refers to a board slot, a system board
installed in the slot, and any devices connected to the board. DR can display the
status of the board, the board slot, and the attachment point. The term occupant
refers to the combination of a board and its attached devices.
■ A board slot (sometimes referred to as a receptacle) has the ability to electrically
isolate the occupant from the host machine. The software can put a board slot into
low-power mode.
■ Board slots can be named according to slot numbers, or can be anonymous (for
example, a SCSI chain).
■ An occupant I/O board includes any external storage devices connected by
interface cables.
There are two types of names for attachment points:
■ A physical attachment point describes the software driver and location of the slot.
Examples of physical attachment point names are:
/devices/pseudo/dr@0:SBx (for a CPU/memory board in slot 0)
-OR-
/devices/pseudo/dr@0:IOx (for an I/O board or Max CPU board in slot 1)
where x represents the expander number (0 through 17 on the Sun Fire 15K
system, and 0 through 8 on the Sun Fire 12K system) for a particular board.
Note – CPU/memory boards are installed only in slot 0. I/O boards and Max CPU
boards are installed only in slot 1.
■ A logical attachment point is an abbreviated name created by the system to refer
to the physical attachment point. Logical attachment points take one of the
following two forms:
SBx (for CPU/memory boards in slot 0)
-OR-
IOx (for I/O boards or Max CPU boards in slot 1)
To obtain a list of all available logical attachment points, use the cfgadm(1M)
command with its -l option.
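The naming rules above can be captured in a short sketch that builds both forms of an attachment point name from a board type and an expander number. The board-type keys ("cpu_memory", "io", "max_cpu") and the function names are hypothetical labels used only for this illustration; the prefixes, the path, and the expander ranges are taken from the text.

```python
# Highest expander number on each system, as stated in this section.
MAX_EXPANDER = {"15K": 17, "12K": 8}

# Hypothetical board-type labels mapped to the attachment point prefix.
PREFIX = {"cpu_memory": "SB", "io": "IO", "max_cpu": "IO"}


def logical_ap(board_type, expander, system="15K"):
    """Return the logical attachment point name, for example SB3 or IO8."""
    if board_type not in PREFIX:
        raise ValueError(f"unknown board type: {board_type}")
    if system not in MAX_EXPANDER:
        raise ValueError(f"unknown system: {system}")
    if not 0 <= expander <= MAX_EXPANDER[system]:
        raise ValueError(f"expander {expander} is out of range for {system}")
    return f"{PREFIX[board_type]}{expander}"


def physical_ap(board_type, expander, system="15K"):
    """Return the physical attachment point name for the same board."""
    return f"/devices/pseudo/dr@0:{logical_ap(board_type, expander, system)}"
```

For example, a CPU/memory board on expander 3 has the logical name SB3 and the physical name /devices/pseudo/dr@0:SB3.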
Conditions and States
A state is the operational status of either a board slot or its occupant. A condition is
the operational status of an attachment point. The cfgadm(1M) command can
display nine types of states and conditions. See Chapter 2, “DR State and Condition
Models,” for descriptions of the conditions and states for system boards and
components.
DR Operations
There are four main types of operations related to boards: connection, configuration,
unconfiguration, and disconnection. A board that is brought into a domain is first
connected and then configured. A board that is removed from a domain is first
unconfigured and then disconnected.
During the connect operation, the system provides power to the slot, and the
operating environment begins monitoring the board’s temperature.
During the configure operation, the operating environment assigns functional roles
to the board, and loads device drivers for the board and for devices attached to it.
During the unconfigure operation, the system detaches the board logically from the
operating environment and takes the associated device drivers offline.
Environmental monitoring continues, but devices on the board are not available for
system use.
During the disconnect operation, the system stops monitoring the board and power
to the slot is turned off.
To power-off a board that is in use (configured), first stop its use (unconfigure it),
and then disconnect it from the domain. After a new or upgraded system board is
inserted into the slot, connect the board and configure it.
The cfgadm(1M) command can connect and configure (or unconfigure and
disconnect) in a single command. To connect and configure a board using a single
command, see the section “Adding a Board” in Chapter 5. To unconfigure and
disconnect a board using a single command, see the section “Removing a Board” in
Chapter 5.
If necessary, each operation (connect, configure, unconfigure, or disconnect) can be
performed separately using the cfgadm(1M) command.
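The ordering of the four operations can be pictured as a small state machine. The sketch below uses a simplified three-state model ("disconnected", "connected", "configured") that is an assumption made for illustration; the full state and condition model is described in Chapter 2, and none of these names are cfgadm(1M) output.

```python
# Simplified, hypothetical model: (current state, operation) -> next state.
TRANSITIONS = {
    ("disconnected", "connect"): "connected",
    ("connected", "configure"): "configured",
    ("configured", "unconfigure"): "connected",
    ("connected", "disconnect"): "disconnected",
}


def apply_operation(state, operation):
    """Return the state after one operation, or raise if the order is invalid."""
    try:
        return TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(f"cannot {operation} a board that is {state}") from None


def run_operations(state, operations):
    """Apply several operations in order and return the final state."""
    for operation in operations:
        state = apply_operation(state, operation)
    return state
```

In this model, adding a board is the sequence connect then configure, and removing a board is unconfigure then disconnect. Attempting to disconnect a board that is still configured is rejected, which matches the rule that a board in use must be unconfigured first.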
Hot-Plug Hardware
Hot-plug boards and modules have special connectors that supply electrical power
to the board or module before the data pins make contact. Boards and devices that
do not have hot-plug connectors cannot be inserted or removed while the system is
running.
I/O boards and CPU/memory boards used in the Sun Fire 15K/12K server are hot-plug
devices. Some devices, such as the peripheral power supply, are not hot-plug
modules and cannot be removed while the system is running.
Sun Fire 15K/12K Domains
The Sun Fire 15K/12K server can be divided into dynamic system domains, which
consist of logical and physical groupings of system board slots. Each domain
is electrically isolated into hardware partitions, which ensures that a problem
encountered in one domain cannot affect other domains.
Domain configuration is determined by the domain configuration table in the
platform configuration database (PCD), which resides on the SC. The domain table
controls how system board slots are logically partitioned into domains. The domain
configuration represents the intended domain configuration. Thus, the configuration
can include empty slots and occupied slots.
The number of slots available to a given domain is controlled by an available
component list that is maintained on the system controller. (Refer to the System
Management Services (SMS) 1.3 Administrator Guide for more information about the
available component list.) After a slot has been assigned to a domain, it becomes
visible to that domain and unavailable and invisible to any other domain.
Conversely, you must disconnect and unassign a slot from its domain before you can
assign and connect it to another domain.
The logical domain is the set of slots that belong to the domain. The physical domain
is the set of boards that are physically interconnected. A slot can be a member of a
logical domain and not be part of a physical domain.
After a domain is booted, the system boards and empty slots can be assigned to (or
unassigned from) a logical domain; however, they cannot become a part of the
physical domain until the operating environment requests it.
System boards or slots that are not assigned to a domain are available to all domains
in whose available component lists they appear. These boards can be assigned to a
domain by the platform administrator. Or, an available component list can be set up
on the system controller to allow users with appropriate privileges to assign
available boards to a domain.
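The assignment rules in this section can be modeled in a few lines. The sketch below is a toy model, not the platform configuration database or any SMS interface: the class, its methods, and the domain names are hypothetical, and it ignores user privileges and the connect and disconnect steps.

```python
class Platform:
    """Toy model of slot assignment governed by available component lists."""

    def __init__(self, available):
        # available: domain name -> slots on that domain's available component list
        self.available = {domain: set(slots) for domain, slots in available.items()}
        self.owner = {}  # slot -> domain the slot is currently assigned to

    def assign(self, slot, domain):
        """Assign a slot to a domain if the list allows it and the slot is free."""
        if slot not in self.available.get(domain, set()):
            raise PermissionError(f"{slot} is not on the list for domain {domain}")
        if slot in self.owner:
            raise ValueError(f"{slot} is already assigned to domain {self.owner[slot]}")
        self.owner[slot] = domain

    def unassign(self, slot):
        """Release a slot so that another domain can be given it."""
        del self.owner[slot]

    def assignable_by(self, domain):
        """Return the unassigned slots on the domain's list, sorted."""
        listed = self.available.get(domain, set())
        return sorted(slot for slot in listed if slot not in self.owner)
```

A slot that appears on two domains' lists is available to both until one of them is assigned it; after that, the other domain can obtain it only once it has been unassigned.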
Component Types
You can use DR to configure or to unconfigure several types of components:
Component Type    Description
cpu               An individual CPU
memory            All of the memory on the board
pci               Any I/O device, controller, or bus
DR on I/O Boards
You must use caution when you add or remove I/O boards to which devices are
attached. Before you can remove a board with I/O devices, all of its devices must be
closed and all of its file systems must be unmounted.
If you need to remove an I/O board with attached devices from a domain
temporarily and then re-add it before any other boards with I/O devices are added,
reconfiguration is not necessary. In this case, device paths to the board devices
remain unchanged.